NUMBER THEORY REVEALED: 
AN INTRODUCTION 


ANDREW GRANVILLE 


AMERICAN MATHEMATICAL SOCIETY


Providence, Rhode Island 


Cover design by Marci Babineau. 


Front cover image of Srinivasa Ramanujan in the playing card: Oberwolfach Photo 
Collection, https://opc.mfo.de/; licensed under Creative Commons Attribution Share 
Alike 2.0 Germany, https://creativecommons.org/licenses/by-sa/2.0/de/deed.en. 


Front cover image of Andrew Wiles in playing card, credit: Alain Goriely. 


2010 Mathematics Subject Classification. Primary 11-01, 11A05, 11A07, 11A15, 11A41, 
11A51, 11B39, 11D04, 11D07. 


For additional information and updates on this book, visit 
www.ams.org/bookpages/mbk-126 


Library of Congress Cataloging-in-Publication Data 


Cataloging-in-Publication Data has been applied for by the AMS. 
See http://www.loc.gov/publish/cip/. 


Copying and reprinting. Individual readers of this publication, and nonprofit libraries acting 
for them, are permitted to make fair use of the material, such as to copy select pages for use 
in teaching or research. Permission is granted to quote brief passages from this publication in 
reviews, provided the customary acknowledgment of the source is given. 

Republication, systematic copying, or multiple reproduction of any material in this publication 
is permitted only under license from the American Mathematical Society. Requests for permission 
to reuse portions of AMS publication content are handled by the Copyright Clearance Center. For 
more information, please visit www.ams.org/publications/pubpermissions. 

Send requests for translation rights and licensed reprints to reprint-permission@ams.org. 


© 2019 by the American Mathematical Society. All rights reserved. 
The American Mathematical Society retains all rights 
except those granted to the United States Government. 
Printed in the United States of America. 


The paper used in this book is acid-free and falls within the guidelines 
established to ensure permanence and durability. 
Visit the AMS home page at https://www.ams.org/ 


10 9 8 7 6 5 4 3 2 1    24 23 22 21 20 19 


Dedicated to my beloved wife, Marci. 

The enchanting charms of this sublime science 
reveal themselves only to those who have the 
courage to go deeply into it. 
CARL FRIEDRICH GAUSS, 1807 

Preface 


This is a modern introduction to number theory, aimed at several different audi- 
ences: students who have little experience of university level mathematics, students 
who are completing an undergraduate degree in mathematics, as well as students 
who are completing a mathematics teaching qualification. Like most introductions 
to number theory, our contents are largely inspired by Gauss’s Disquisitiones Arith- 
meticae (1801), though we also include many modern developments. We have gone 
back to Gauss to borrow several excellent examples to highlight the theory. 


There are many different topics that might be included in an introductory 
course in number theory, and others, like the law of quadratic reciprocity, that surely 
must appear in any such course. The first dozen chapters of the book therefore 
present a “standard” course. In the masterclass version of this book we flesh out 
these topics, in copious appendices, as well as adding five additional chapters on 
more advanced themes. In the introductory version we select an appendix for each 
chapter that might be most useful as supplementary material.¹ A “minimal” course 
might focus on the first eight chapters and at least one of chapters 9 and 10.² 


Much of modern mathematics germinated from number-theoretic seed and one 
of our goals is to help the student appreciate the connection between the relatively 
simply defined concepts in number theory and their more abstract generalizations 
in other courses. For example, our appendices allow us to highlight how mod- 
ern algebra stems from investigations into number theory and therefore serve as 
an introduction to algebra (including rings, modules, ideals, Galois theory, p-adic 
numbers,...). These appendices can be given as additional reading, perhaps as 
student projects, and we point the reader to further references. 

Following Gauss, we often develop examples before giving a formal definition 
and a theorem, firstly to see how the concept arises naturally, secondly to conjecture 
a theorem that describes an evident pattern, and thirdly to see how a proof of the 
theorem emerges from understanding some non-trivial examples. 


¹ In the main text we occasionally refer to appendices that only appear in the masterclass version. 
² Several sections might be discarded; their headings are in bold italics. 


Why study number theory? Questions arise when studying any subject, some- 
times fascinating questions that may be difficult to answer precisely. Number theory 
is the study of the most basic properties of the integers, literally taking integers 
apart to see how they are built, and there we find an internal beauty and coherence 
that encourages many of us to seek to understand more. Facts are often revealed by 
calculations, and then researchers seek proofs. Sometimes the proofs themselves, 
even more than the theorems they prove, have an elegance that is beguiling and 
reveal that there is so much more to understand. With good reason, Gauss called 
number theory the “Queen of Mathematics”, ever mysterious, but nonetheless gra- 
ciously sharing with those that find themselves interested. In this first course there 
is much that is accessible, while at the same time natural, easily framed, questions 
arise which remain open, stumping the brightest minds. 


Once celebrated as one of the more abstract subjects in mathematics, today 
there are scores of applications of number theory in the real world, particularly to 
the theory and practice of computer algorithms. Best known is the use of number 
theory in designing cryptographic protocols (as discussed in chapter 10), hiding 
our secrets behind the seeming difficulty of factoring large numbers which only 
have large prime factors. 


For some students, studying number theory is a life-changing experience: They 
find themselves excited to go on to penetrate more deeply, or perhaps to pursue 
some of the fascinating applications of the subject. 


Why give proofs? We give proofs to convince ourselves and others that our 
reasoning is correct. Starting from agreed upon truths, we try to derive a further 
truth, being explicit and precise about each step of our reasoning. A proof must 
be readable by people besides the author. It is a way of communicating ideas and 
needs to be persuasive, not just to the writer but also to a mathematically literate 
person who cannot obtain further clarification from the writer on any point that is 
unclear. It is not enough that the writer believes it; it must be clear to others. The 
burden of proof lies with the author. 


The word “proof” can mean different things in different disciplines. In some 
disciplines a “proof” can be several different examples that justify a stated hypoth- 
esis, but this is inadequate in mathematics: One can have a thousand examples that 
work as predicted by the hypothesis, but the thousand and first might contradict 
it. Therefore to “prove” a theorem, one must build an incontrovertible argument 
up from first principles, so that the statement must be true in every case, assuming 
that those first principles are true. 


Occasionally we give more than one proof of an important theorem, to highlight 
how inevitably the subject develops, as well as to give the instructor different 
options for how to present the material. (Few students will benefit from seeing 
all of the proofs on their first time encountering this material.) 


Motivation. Challenging mathematics courses, such as point-set topology, al- 
gebraic topology, measure theory, differential geometry, and so on, tend to be dom- 
inated at first by formal language and requirements. Little is given by way of 
motivation. Sometimes these courses are presented as a prerequisite for topics that 
will come later. There is little or no attempt to explain what all this theory is good 
for or why it was developed in the first place. Students are expected to subject 
themselves to the course, motivated primarily by trust. 


How boring! Mathematics surely should not be developed only for those few 
who already know that they wish to specialize and have a high tolerance for bore- 
dom. We should help our students to appreciate and cherish the beauty of math- 
ematics. Surely courses should be motivated by a series of interesting questions. 
The right questions will highlight the benefits of an abstract framework, so that 
the student will wish to explore even the most rarified paths herself, as the bene- 
fits become obvious. Number theory does not require much in the way of formal 
prerequisites, and there are easy ways to justify most of its abstraction. 


In this book, we hope to capture the attention and enthusiasm of the reader 
with the right questions, guiding her as she embarks for the first time on this 
fascinating journey. 


Student expectations. For some students, number theory is their first course 
that formulates abstract statements of theorems, which can take them outside of 
their “comfort zone”. This can be quite a challenge, especially as high school 
pedagogy moves increasingly to training students to learn and use sophisticated 
techniques, rather than appreciate how those techniques arose. We believe that 
one can best use (and adapt) methods if one fully appreciates their genesis, so 
we make no apologies for this feature of the elementary number theory course. 
However this means that some students will be forced to adjust their personal 
expectations. Future teachers sometimes ask why they need to learn material, 
and take a perspective, so far beyond what they will be expected to teach in high 
school. There are many answers to this question; one is that, in the long term, the 
material in high school will be more fulfilling if one can see its long-term purpose. A 
second response is that every teacher will be confronted by students who are bored 
with their high school course and desperately seeking harder intellectual challenges 
(whether they realize it themselves or not); the first few chapters of this book should 
provide the kind of intellectual stimulation those students need. 


Exercises. Throughout the book, there are a lot of problems to be solved. Easy 
questions, moderate questions, hard questions, exceptionally difficult questions. No 
one should do them all. The idea of having so many problems is to give the teacher 
options that are suitable for the students’ backgrounds: 


An unusual feature of the book is that exercises appear embedded in the text.³ 
This is done to enable the student to complete the proofs of theorems as one goes 
along.⁴ This does not require the students to come up with new ideas but rather to 
follow the arguments given so as to fill in the gaps. For less experienced students it 
helps to write out the solutions to these exercises; more experienced students might 
just satisfy themselves that they can provide an appropriate proof. 


³ Though they can be downloaded, as a separate list, from www.ams.org/granville-number-theory. 
⁴ Often students have little experience with proofs and struggle with the level of sophistication 
required, at least without adequate guidance. 

Other questions work through examples. There are more challenging exercises 
throughout, indicated by the symbol ‘ next to the question numbers, in which the 
student will need to independently bring together several of the ideas that have been 
discussed. Then there are some really tough questions, indicated by the symbol -, 
in which the student will need to be creative, perhaps even providing ideas not 
given, or hinted at, in the text. 


A few questions in this book are open-ended, some even phrased a little misleadingly. 
The student who tries to develop those themes her- or himself might 
embark upon a rewarding voyage of discovery. Once, after I had set the exercises 
in section 9.2 for homework, some students complained how unfair they felt these 
questions were but were silenced by another student who announced that it was so 
much fun for him to work out the answers that he now knew what he wanted to do 
with his life! 


At the end of the book we give hints for many of the exercises, especially those 
that form part of a proof. 


Special features of our syllabus. Number theory sometimes serves as an intro- 
duction to “proof techniques”. We give many exercises to practice those techniques, 
but to make it less boring, we do so while developing certain themes as the book 
progresses, for example, the theory of recurrence sequences, and properties of binomial 
coefficients. We dedicate a preliminary chapter to induction and use it to 
develop the theory of sums of powers. Here is a list of the main supplementary 
themes which appear in the book: 


Special numbers: Bernoulli numbers; binomial coefficients and Pascal’s triangle; 
Fermat and Mersenne numbers; and the Fibonacci sequence and general second- 
order linear recurrences. 


Subjects in their own right: Algebraic numbers, integers, and units; computation 
and running times; continued fractions; dynamics; groups, especially of 
matrices; factoring methods and primality testing; ideals; irrationals and transcen- 
dentals; and rings and fields. 


Formulas for cyclotomic polynomials, Dirichlet L-functions, the Riemann zeta- 
function, and sums of powers of integers. 


Interesting issues: Lifting solutions; polynomial properties; resultants and dis- 
criminants; roots of polynomials, constructibility and pre-Galois theory; square 
roots (mod n); and tests for divisibility. 


Fun and famous problems like the abc-conjecture, Catalan’s conjecture, Egyp- 
tian fractions, Fermat’s Last Theorem, the Frobenius postage stamp problem, magic 
squares, primes in arithmetic progressions, tiling with rectangles and with circles. 


Our most unconventional choice is to give a version of Rousseau’s proof of the 
law of quadratic reciprocity, which is directly motivated by Gauss’s proof of Wil- 
son’s Theorem. This proof avoids Gauss’s Lemma so is a lot easier for a beginning 
student than Eisenstein’s elegant proof (which we give in section 8.10 of appendix 
8A). Gauss’s original proof of quadratic reciprocity is more motivated by the introductory 
material, although a bit more complicated than these other two proofs. 
We include Gauss’s original proof in appendix 8C, and we also understand 
(2/n) in his way, in the basic course, to interest the reader. We present 
several other proofs, including a particularly elegant proof using Gauss sums in 
section 14.7. 


Further exploration of number theory. There is a tremendous leap in the level 
of mathematical knowledge required to take graduate courses in number theory, 
because curricula expect the student to have taken (and appreciated) several other 
relevant courses. This is a shame since there is so much beautiful advanced material 
that is easily accessible after finishing an introductory course. Moreover, it can be 
easier to study other courses, if one already understands their importance, rather 
than taking it on trust. Thus this book, Number Theory Revealed, is designed 
to lead to two subsequent books, which develop the two main thrusts of number 
theory research: 


In The distribution of primes: Analytic number theory revealed, we will discuss 
how number theorists have sought to develop the themes of chapter 5 (as well as 
chapters 4 and 13). In particular we prove the prime number theorem, based 
on the extraordinary ideas of Riemann. This proof rests heavily on certain ideas 
from complex analysis, which we will outline in a way that is relevant for a good 
understanding of the proofs. 


In Rational points on curves: Arithmetic geometry revealed, we look at solu- 
tions to Diophantine equations, especially those of degree two and three, extending 
the ideas of chapter 12 (as well as chapters 14 and 17). In particular we will prove 
Mordell’s Theorem (developed here in special cases in chapter 17) and gain a basic 
understanding of modular forms, outlining some of the main steps in Wiles’s proof 
of Fermat’s Last Theorem. We avoid a deep understanding of algebraic geometry, 
instead proceeding by more elementary techniques and a little complex analysis 
(which we explain). 


References. There is a list of great number theory books at the end of our 
book and references that are recommended for further reading at the end of many 
chapters and appendices. Unlike most textbooks, I have chosen to not include a 
reference to every result stated, nor necessarily to most relevant articles, but rather 
focus on a smaller number that might be accessible to the reader. Moreover, many 
readers are used to searching online for keywords; this works well for many themes 
in mathematics.⁵ However, the student researching online should be warned that 
Wikipedia articles are often out of date, sometimes misleading, and too often poorly 
written. It is best to try to find relevant articles published in expository research 
journals, such as the American Mathematical Monthly,⁶ or posted at arxiv.org, which 
is “open access”, to supplement the course material. 


⁵ Though getting just the phrasing to find the right level of article can be challenging. 
⁶ Although this is behind a paywall, it can be accessed, like many journals, by logging on from most 
universities, which have paid subscriptions for their students and faculty. 


The cover (designed by Marci Babineau and the author). 


In 1675, Isaac Newton explained his extraordinary breakthroughs in physics and 
mathematics by claiming, “If I have seen further it is by standing on the shoulders 
of Giants.” Science has always developed this way, no more so than in the theory 
of numbers. Our cover represents five giants of number theory, in a fan of cards, 
each of whose work built upon the previous luminaries. 


Modern number theory was born from PIERRE DE FERMAT’s readings of the 
ancient Greek texts (as discussed in section 6.1) in the mid-17th century, and his 
enunciation of various results including his tantalizingly difficult to prove “Last 
Theorem.” His “Little Theorem” (chapter 7) and his understanding of sums of two 
squares (chapter 9) are part of the basis of the subject. 


The first modern number theory book, Gauss’s Disquisitiones Arithmeticae, on 
which this book is based, was written by CARL FRIEDRICH GAUSS at the beginning 
of the 19th century. As a teenager, Gauss rethought many of the key ideas in number 
theory, especially the law of quadratic reciprocity (chapter 8) and the theory of 
binary quadratic forms (chapter 12), as well as inspiring our understanding of the 
distribution of primes (chapter 5). 


Gauss’s contemporary SOPHIE GERMAIN made perhaps the first great effort to 
attack Fermat’s Last Theorem (her effort is discussed in appendix 7F). Developing 
her work inspired my own first research efforts. 


SRINIVASA RAMANUJAN, born in poverty in India at the end of the 19th cen- 
tury, was the most talented untrained mathematician in history, producing some 
extraordinary results before dying at the age of 32. He was unable to satisfactorily 
explain many of his extraordinary insights which penetrated difficult subjects far 
beyond the more conventional approaches. (See appendix 12F and chapters 13, 15, 
and 17.) Some of his identities are still inspiring major developments today in both 
mathematics and physics. 


ANDREW WILES sits atop our deck. His 1994 proof of Fermat’s Last Theorem 
built on the ideas of the previous four mentioned mathematicians and very many 
other “giants” besides. His great achievement is a testament to the success of 
science building on solid grounds. 


Thanks. I would like to thank the many inspiring mathematicians who have 
helped me shape my view of elementary number theory, most particularly Bela 
Bollobas, Paul Erdős, D. H. Lehmer, James Maynard, Ken Ono, Paulo Ribenboim, 
Carl Pomerance, John Selfridge, Dan Shanks, and Hugh C. Williams as well 
as those people who have participated in developing the relatively new subject of 
“additive combinatorics” (see sections 15.3, 15.4, and 15.6). Several people 
have shared insights or new works that have made their way into this book: 
Stephanie Chan, Leo Goldmakher, Richard Hill, Alex Kontorovich, Jennifer Park, 
and Richard Pinch. The six anonymous reviewers added some missing perspec- 
tives and Olga Balkanova, Stephanie Chan, Patrick Da Silva, Tristan Freiberg, 
Ben Green, Mariah Hamel, Jorge Jimenez, Nikoleta Kalaydzhieva, Dimitris Kouk- 
oulopoulos, Youness Lamzouri, Jennifer Park, Sam Porritt, Ethan Smith, Anitha 
Srinivasan, Paul Voutier, and Max Wenqiang Xu kindly read subsections of the 
near-final draft, making valuable comments. 


Gauss’s Disquisitiones 
Arithmeticae 


In July 1801, Carl Friedrich Gauss published Disquisitiones Arithmeticae, a book 
on number theory, written in Latin. It had taken five years to write but was im- 
mediately recognized as a great work, both for the new ideas and its accessible 
presentation. Gauss was then widely considered to be the world’s leading mathe- 
matician, and today we rate him as one of the three greatest in history, alongside 
Archimedes and Sir Isaac Newton. 


The first four chapters of Disquisitiones Arithmeticae consist of essentially the 
same topics as our course today (with suitable modifications for advances made in 
the last two hundred years). His presentation of ideas is largely the model upon 
which modern mathematical writing is based. There follow several chapters on qua- 
dratic forms and then on the rudiments of what we would call Galois theory today, 
most importantly the constructibility of regular polygons. Finally, the publisher 
felt that the book was long enough, and several further chapters did not appear in 
the book (though Dedekind published Gauss’s disorganized notes, in German, after 
Gauss’s death). 


One cannot overestimate the importance of Disquisitiones to the development 
of 19th-century mathematics. It led, besides many other things, to Dirichlet’s 
formulation of ideals (see sections 3.19 and 3.20 of appendix 3D, section 12.8 of appendix 12A, 
and appendix 12B), and the exploration of the geometry of the upper 
half-plane (see Theorem 1.2 and the subsequent discussion). 


As a young man, Dirichlet took his copy of Disquisitiones with him wherever 
he went. He even slept with it under his pillow. As an old man, it was his most 
prized possession even though it was in tatters. It was translated into French in 
1807, German in 1889, Russian in 1959, English only in 1965, Spanish and Japanese 
in 1995, and Catalan in 1996! 


Disquisitiones is no longer read by many people. The notation is difficult. The 
assumptions about what the reader knows do not fit today’s reader (for example, 
neither linear algebra nor group theory had been formulated by the time Gauss 
wrote his book, although Disquisitiones would provide some of the motivation for 
developing those subjects). Yet, many of Gauss’s proofs are inspiring, and some 
have been lost to today’s literature. Moreover, although the more advanced two- 
thirds of Disquisitiones focus on binary quadratic forms and have led to many of 
today’s developments, there are several themes there that are not central to today’s 
research. In the fourth book in our trilogy (!), Gauss’s Disquisitiones Arithmeticae 
revealed, we present a reworking of Gauss’s classic, rewriting it in modern notation, 
in a style more accessible to the modern reader. We also give the first English 
version of the missing chapters, which include several surprises. 


Notation 


N — The natural numbers, 1, 2, 3, .... 

Z — The integers, ..., −3, −2, −1, 0, 1, 2, 3, .... 
Throughout, all variables are taken to be integers, unless otherwise specified. 
Usually p, and sometimes q, will denote prime numbers. 

Q — The rational numbers, that is, the fractions a/b with a ∈ Z and b ∈ N. 

R — The real numbers. 


C — The complex numbers. 


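$$\sum_{\substack{\text{Some variables:} \\ \text{Certain conditions hold}}} \text{summand} \qquad \text{and} \qquad \prod_{\substack{\text{Some variables:} \\ \text{Certain conditions hold}}} \text{summand}$$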


mean that we sum, or product, the summand over the integer values of some vari- 
able, satisfying certain conditions. 

Brackets and parentheses: There are all sorts of brackets and parentheses in math- 
ematics. It is helpful to have protocols with them that take on meaning, so we do 
not have to repeat ourselves too often, as we will see in the notation below. But we 
also use them in equations; usually we surround an expression with “(” and “)” to 
be clear where the expression begins and ends. If too many of these are used in one 
line, then we might use different sizes or even “{” and “}” instead. If the brackets 
have a particular meaning, then the reader will be expected to discern that from 
the context. 


A[x] — The set of polynomials with coefficients from the set A, that is, $f(x) = \sum_{i=0}^{d} a_i x^i$ 
where each $a_i \in A$. Mostly we work with A = Z. 


A(x) — The set of rational functions with coefficients from the set A, in other words, 
functions f(x)/g(x) where f(x), g(x) ∈ A[x] and g(x) ≠ 0. 


[t] — The integer part of t, that is, the largest integer ≤ t. 

{t} — The fractional part of (real number) t, that is, {t} = t − [t]. Notice that 
0 ≤ {t} < 1. 

(a, b) — The greatest common divisor of a and b. 

[a, b] — The least common multiple of a and b. 

b | a — Means b divides a. 

$p^k \| a$ — Means $p^k$ divides a, but not $p^{k+1}$ (where p is prime). In other words, k is 
the “exact power” of p dividing a. 

I(a, b) — The set {am + bn : m, n ∈ Z}, which is called the ideal generated by a 
and b over Z. 

log — The logarithm in base e, the natural logarithm, which is often denoted by 
“In” in earlier courses. 

Parity — The parity of an integer is either even (if it is divisible by 2) or odd (if it 
is not divisible by 2). 
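
Several of the notations above can be tried out on a computer. The following is a minimal sketch in Python, not part of the original text; the function names `integer_part`, `fractional_part`, `lcm`, and `exact_power` are our own illustrative choices, and exact rational arithmetic (`Fraction`) is used so that no rounding error enters.

```python
import math
from fractions import Fraction


def integer_part(t):
    """[t]: the largest integer <= t."""
    return math.floor(t)


def fractional_part(t):
    """{t} = t - [t], which always satisfies 0 <= {t} < 1."""
    return t - math.floor(t)


def lcm(a, b):
    """[a, b]: the least common multiple of positive integers a and b."""
    return a * b // math.gcd(a, b)


def exact_power(p, a):
    """The k for which p^k divides a but p^(k+1) does not (a must be nonzero)."""
    if a == 0:
        raise ValueError("every power of p divides 0, so there is no exact power")
    k = 0
    while a % p == 0:
        a //= p
        k += 1
    return k


t = Fraction(-7, 2)
print(integer_part(t), fractional_part(t))  # integer and fractional parts of -7/2
print(math.gcd(12, 18), lcm(12, 18))        # (12, 18) and [12, 18]
print(exact_power(2, 48))                   # exact power of 2 dividing 48 = 2^4 * 3
```

Note that for negative t the integer part rounds down, not towards zero, which is what keeps the fractional part in the range 0 ≤ {t} < 1.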


The language of mathematics 


“By a conjecture we mean a proposition that has not yet been proven but which is 
favored by some serious evidence. It may be a significant amount of computational 
evidence, or a body of theory and technique that has arisen in the attempt to settle 
the conjecture. 


An open question is a problem where the evidence is not very convincing one 
way or the other. 


A theorem, of course, is something that has been proved. There are important 
theorems, and there are unimportant (but perhaps curious) theorems. 


The distinction between open question and conjecture is, it is true, somewhat 
subjective, and different mathematicians may form different judgements concerning 
a particular problem. We trust that there will be no similar ambiguity concerning 
the theorems.” 


— Dan Shanks [Sha85, p. 2] 

Today we might add to this a heuristic argument, in which we explore an open 
question with techniques that help give us a good idea of what to conjecture, even 
if those techniques are unlikely to lead to a formal proof. 


Prerequisites 


The reader should be familiar with the commonly used sets of numbers N, Z, and Q, 
as well as polynomials with integer coefficients, denoted by Z[x]. Proofs will often 
use the principle of induction; that is, if S(n) is a given mathematical assertion, 
dependent on the integer n, then to prove that it is true for all n € N, we need only 
prove the following: 


• S(1) is true. 

• S(k) is true implies that S(k + 1) is true, for all integers k ≥ 1. 

The example that is usually given to highlight the principle of induction is the 
statement “$1 + 2 + 3 + \cdots + n = \frac{n(n+1)}{2}$,” which we denote by S(n).¹ For n = 1 we 
check that $1 = \frac{1 \cdot 2}{2}$ and so S(1) is true. For any k ≥ 1, we assume that S(k) is true 
and then deduce that 

$$
\begin{aligned}
1 + 2 + 3 + \cdots + (k+1) &= (1 + 2 + 3 + \cdots + k) + (k+1) \\
&= \frac{k(k+1)}{2} + (k+1) \quad \text{as } S(k) \text{ is true} \\
&= \frac{(k+1)(k+2)}{2};
\end{aligned}
$$

that is, S(k + 1) is true. Hence, by the principle of induction, we deduce that S(n) 
is true for all integers n ≥ 1. 
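
The two ingredients of this induction can also be checked numerically. The sketch below, in Python, is an illustration that we add here and not part of the original argument; the names `sum_first` and `formula` are hypothetical. It tests the base case S(1), the algebraic identity used in the induction step, and S(n) itself for small n. As the Preface stresses, such a finite check is evidence only, not a proof.

```python
def sum_first(n):
    """1 + 2 + ... + n, computed by adding the terms one at a time."""
    return sum(range(1, n + 1))


def formula(n):
    """The closed form n(n+1)/2 asserted by S(n)."""
    return n * (n + 1) // 2


# Base case: S(1).
assert sum_first(1) == formula(1)

# Induction step: if the sum up to k equals formula(k), then adding (k + 1)
# must give formula(k + 1). This is the identity k(k+1)/2 + (k+1) = (k+1)(k+2)/2.
for k in range(1, 1000):
    assert formula(k) + (k + 1) == formula(k + 1)

# Direct check of S(n) for the first few hundred values of n.
assert all(sum_first(n) == formula(n) for n in range(1, 300))
print("S(n) verified for 1 <= n < 300")
```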


To highlight the technique of induction with more examples, we develop the 
theory of sums of powers of integers (for example, we prove a statement which 
gives a formula for $1^2 + 2^2 + \cdots + n^2$ for each integer n ≥ 1) in section 0.1 and 
give formulas for the values of the terms of recurrence sequences (like the Fibonacci 
numbers) in section 0.2. 


¹ There are other, easier, proofs of this assertion, but induction will be the only viable technique 
to prove some of the more difficult theorems in the course, which is why we highlight the technique here. 

Induction and the least counterexample: Induction can be slightly disguised. For 
example, sometimes one proves that a statement T(n) is true for all n ≥ 1, by 
supposing that it is false for some n and looking for a contradiction. If T(n) is 
false for some n, then there must be a least integer m for which T(m) is false. The 
trick is to use the assumption that T(m) is false to prove that there exists some 
smaller integer k, 1 ≤ k < m, for which T(k) is also false. This contradicts the 
minimality of m, and therefore T(n) must be true for all n ≥ 1. Such proofs are 
easily reformulated into an induction proof: 


Let S(n) be the statement that T(1), T(2), ..., T(n) all hold. The induction 
proof then works, for if S(m − 1) is true, but S(m) is false, then T(m) is false and 
so, by the previous paragraph, T(k) is false for some integer k, 1 ≤ k ≤ m − 1, 
which contradicts the assumption that S(m − 1) is true. 


A beautiful example is given by the statement, “Every integer > 1 has a prime 
divisor.” (A prime number is an integer > 1, such that the only positive integers 
that divide it are 1 and itself.) Let T(n) be the statement that n has a prime 
divisor, and let S(n) be the statement that T(2), T(3), ..., T(n) all hold. Evidently 
S(2) = T(2) is true since 2 is prime. We suppose that S(k) is true (so that 
T(2), T(3), ..., T(k) all hold). Now: 

Either k+1 is itself a prime number, in which case T(k+1) holds and therefore 
S(k +1) holds. 


Or k + 1 is not prime, in which case it has a divisor d which is not equal to either 
1 or k + 1, and so 2 ≤ d ≤ k. But then S(d) holds by the induction hypothesis, 
and so there is some prime p, which divides d, and therefore divides k + 1. Hence 
T(k +1) holds and therefore S(k +1) holds. 


(The astute reader might ask whether certain “facts” that we have used here deserve 
a proof. For example, if a prime p divides d, and d divides k + 1, then p divides 
k + 1. We have also assumed the reader understands that when we write “d divides 
k +1” we mean that when we divide k + 1 by d, the remainder is zero. One of our 
goals at the beginning of the course is to make sure that everyone interprets these 
simple facts in the same way, by giving as clear definitions as possible and outlining 
useful, simple deductions from these definitions.) 
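
The induction argument for “every integer > 1 has a prime divisor” translates directly into a recursive procedure. The following Python sketch is our own illustration, with hypothetical function names. Either n has no divisor d with 2 ≤ d ≤ n − 1, in which case n is itself prime, or we pass to such a divisor d, which is smaller than n, and use the fact that any prime dividing d also divides n. We deliberately search for the largest proper divisor so that the recursion visibly descends through several steps; this is meant to mirror the proof, not to be an efficient algorithm.

```python
def prime_divisor(n):
    """Return a prime that divides the integer n > 1, following the induction proof."""
    if n < 2:
        raise ValueError("n must be an integer greater than 1")
    # Look for a divisor d with 2 <= d <= n - 1, trying the largest candidates first.
    for d in range(n - 1, 1, -1):
        if n % d == 0:
            # n is not prime; d < n, so by the induction hypothesis d has a
            # prime divisor, and that prime also divides n.
            return prime_divisor(d)
    # No such divisor exists, so n itself is prime.
    return n


def is_prime(n):
    """Check primality straight from the definition (only sensible for small n)."""
    return n > 1 and all(n % d != 0 for d in range(2, n))


for n in (2, 12, 35, 97, 360):
    p = prime_divisor(n)
    assert n % p == 0 and is_prime(p)
    print(n, "has the prime divisor", p)
```

For example, with n = 12 the procedure passes from 12 to its divisor 6, then from 6 to its divisor 3, and stops there because 3 is prime.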


Chapter 0 


Preliminary Chapter 
on Induction 


Induction is an important proof technique in number theory. This preliminary 
chapter gives the reader the opportunity to practice its use, while learning about 
some intriguing number-theoretic concepts. 


0.1. Fibonacci numbers and other recurrence sequences 


The Fibonacci numbers, perhaps the most famous sequence of integers, begin with


$F_0 = 0,\ F_1 = 1,\ F_2 = 1,\ F_3 = 2,\ 3,\ 5,\ 8,\ 13,\ 21,\ 34,\ 55,\ 89,\ 144,\ 233, \ldots$


The Fibonacci numbers appear in many places in mathematics and its applications.¹
They obey a rule giving each term of the Fibonacci sequence in terms of
the recent history of the sequence:


$F_n = F_{n-1} + F_{n-2}$ for all integers $n \ge 2$.


We call this a recurrence relation. It is not difficult to find a formula for $F_n$:


(0.1.1) $F_n = \frac{1}{\sqrt{5}} \left( \left( \frac{1+\sqrt{5}}{2} \right)^n - \left( \frac{1-\sqrt{5}}{2} \right)^n \right)$ for all integers $n \ge 0$,


where $\frac{1+\sqrt{5}}{2}$ and $\frac{1-\sqrt{5}}{2}$ each satisfy the equation $x + 1 = x^2$. Having such an explicit
formula for the Fibonacci numbers makes them easy to work with, but there is a
problem. It is not obvious from this formula that every Fibonacci number is an
integer; however that does follow easily from the original recurrence relation.²
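
The contrast between the recurrence and formula (0.1.1) can be checked numerically. The following is a minimal sketch, not part of the original text: the function names are illustrative, and the floating-point evaluation of (0.1.1) is only reliable for small n, which is why the result is rounded before comparing.

```python
from math import sqrt


def fib_recurrence(n):
    """Return F_n from F_n = F_{n-1} + F_{n-2}, with F_0 = 0 and F_1 = 1."""
    a, b = 0, 1  # the pair (F_0, F_1)
    for _ in range(n):
        a, b = b, a + b
    return a


def fib_closed_form(n):
    """Evaluate the right-hand side of (0.1.1) in floating point."""
    s = sqrt(5)
    return (((1 + s) / 2) ** n - ((1 - s) / 2) ** n) / s


for n in range(15):
    # Rounding absorbs floating-point error; this is safe only for small n.
    assert round(fib_closed_form(n)) == fib_recurrence(n)

print([fib_recurrence(n) for n in range(14)])
```

The recurrence version uses only integer additions, so every value it produces is visibly an integer, whereas the closed form passes through irrational numbers.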


¹ Typically when considering a biological process whose current state depends on its past, such as
evolution and brain development.

² It requires quite sophisticated ideas to decide whether a given complicated formula like (0.1.1) is
an integer or not. Learn more about this in appendix 0F on symmetric polynomials.

Exercise 0.1.1. (a) Use the recurrence relation for the Fibonacci numbers, and induction, to
prove that every Fibonacci number is an integer.
(b) Prove that (0.1.1) is correct by verifying that it holds for n = 0, 1 and then, for all larger
integers n, by induction.


Exercise 0.1.2. Use induction on $n \ge 1$ to prove that
(a) $F_1 + F_3 + \cdots + F_{2n-1} = F_{2n}$ and
(b) $1 + F_2 + F_4 + \cdots + F_{2n} = F_{2n+1}$.


The number $\phi = \frac{1+\sqrt{5}}{2}$ is called the golden ratio; one can show that $F_n$ is the
nearest integer to $\phi^n/\sqrt{5}$.


Exercise 0.1.3. (a) Prove that $\phi$ satisfies $\phi^2 = \phi + 1$.
(b) Prove that $\phi^n = F_n \phi + F_{n-1}$ for all integers $n \ge 1$, by induction.


Any sequence $x_0, x_1, x_2, \ldots$, for which the terms $x_n$, with $n \ge 2$, are defined
by the equation


(0.1.2) $x_n = a x_{n-1} + b x_{n-2}$ for all $n \ge 2$,


where $a, b, x_0, x_1$ are given, is called a second-order linear recurrence sequence.
Although this is a vast generalization of the Fibonacci numbers one can still prove
a formula for the general term, $x_n$, analogous to (0.1.1): We begin by factoring the
polynomial

$x^2 - ax - b = (x - \alpha)(x - \beta)$


for the appropriate $\alpha, \beta \in \mathbb{C}$ (we had $x^2 - x - 1 = \left( x - \frac{1+\sqrt{5}}{2} \right) \left( x - \frac{1-\sqrt{5}}{2} \right)$ for the
Fibonacci numbers). If $\alpha \ne \beta$, then there exist coefficients $c_\alpha, c_\beta$ for which


(0.1.3) $x_n = c_\alpha \alpha^n + c_\beta \beta^n$ for all $n \ge 0$.


(In the case of the Fibonacci numbers, we have $c_\alpha = 1/\sqrt{5}$ and $c_\beta = -1/\sqrt{5}$.)
Moreover one can determine the values of $c_\alpha$ and $c_\beta$ by solving the simultaneous
equations obtained by evaluating the formula at n = 0 and n = 1, that is,


$c_\alpha + c_\beta = x_0$ and $c_\alpha \alpha + c_\beta \beta = x_1$.
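
The recipe just described (factor the polynomial, then solve two simultaneous equations for the coefficients) can be sketched in code. This is an illustrative sketch under stated assumptions, not the book's own procedure: the function names are hypothetical, complex floating-point arithmetic is used so that non-real roots are handled, and the repeated-root case is rejected rather than treated.

```python
import cmath


def closed_form_terms(a, b, x0, x1, count):
    """Terms x_0, ..., x_{count-1} computed from (0.1.3); needs distinct roots."""
    disc = cmath.sqrt(a * a + 4 * b)
    alpha, beta = (a + disc) / 2, (a - disc) / 2  # roots of x^2 - a x - b
    if alpha == beta:
        raise ValueError("repeated root: formula (0.1.3) does not apply")
    # Solve c_alpha + c_beta = x0 and c_alpha*alpha + c_beta*beta = x1.
    c_alpha = (x1 - x0 * beta) / (alpha - beta)
    c_beta = x0 - c_alpha
    return [c_alpha * alpha**n + c_beta * beta**n for n in range(count)]


def recurrence_terms(a, b, x0, x1, count):
    """The same terms computed directly from the recurrence (0.1.2)."""
    xs = [x0, x1]
    while len(xs) < count:
        xs.append(a * xs[-1] + b * xs[-2])
    return xs[:count]


# Fibonacci numbers: a = b = 1, x0 = 0, x1 = 1.
for c, r in zip(closed_form_terms(1, 1, 0, 1, 12), recurrence_terms(1, 1, 0, 1, 12)):
    assert abs(c - r) < 1e-6
```

The comparison uses a small tolerance because the closed form is evaluated in floating point, while the recurrence stays in exact integer arithmetic.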


Exercise 0.1.4. (a) Prove (0.1.3) is correct by verifying that it holds for n = 0, 1 (with $x_0$ and
$x_1$ as in the last displayed equation) and then by induction for $n \ge 2$.
(b) Show that $c_\alpha$ and $c_\beta$ are uniquely determined by $x_0$ and $x_1$, provided $\alpha \ne \beta$.
(c) Show that if $\alpha \ne \beta$ with $x_0 = 0$ and $x_1 = 1$, then $x_n = \frac{\alpha^n - \beta^n}{\alpha - \beta}$ for all integers $n \ge 0$.
(d) Show that if $\alpha \ne \beta$ with $y_0 = 2$, $y_1 = a$ and $y_n = a y_{n-1} + b y_{n-2}$ for all $n \ge 2$, then
$y_n = \alpha^n + \beta^n$ for all integers $n \ge 0$.


The $\{x_n\}_{n \ge 0}$ in (c) is a Lucas sequence, and the $\{y_n\}_{n \ge 0}$ in (d) its companion sequence.


Exercise 0.1.5.³ (a) Prove that $\alpha = \beta$ if and only if $a^2 + 4b = 0$.
(b)† Show that if $a^2 + 4b = 0$, then $\alpha = a/2$ and $x_n = (cn + d)\alpha^n$ for all integers $n \ge 0$, for
some constants c and d.
(c) Deduce that if $\alpha = \beta$ with $x_0 = 0$ and $x_1 = 1$, then $x_n = n\alpha^{n-1}$ for all $n \ge 0$.


Exercise 0.1.6. Prove that if $x_0 = 0$ and $x_1 = 1$, if (0.1.2) holds, and if $\alpha$ is a root of $x^2 - ax - b$,
then $\alpha^n = \alpha x_n + b x_{n-1}$ for all $n \ge 1$.


³ In this question, and from here on, induction should be used at the reader’s discretion.




0.2. Formulas for sums of powers of integers 


When Gauss was ten years old, his mathematics teacher aimed to keep his class 
quiet by asking them to add together the integers from 1 to 100. Gauss did this in 
a few moments, by noting that if one adds that list of numbers to itself, but with the
second list in reverse order, then one has


$1 + 100 = 2 + 99 = 3 + 98 = \cdots = 99 + 2 = 100 + 1 = 101.$


That is, twice the asked-for sum equals 100 times 101, and so

$1 + 2 + \cdots + 100 = \frac{1}{2} \times 100 \times 101.$


This argument generalizes to adding up the natural numbers less than any given
N, yielding the formula⁴


(0.2.1) $\sum_{n=1}^{N-1} n = \frac{(N-1)N}{2}.$


The sum on the left-hand side of this equation varies in length with N, whereas 
the right-hand side does not. The right-hand side is a formula whose value varies 
but has a relatively simple structure, so we call it a closed form expression. (In the 
prerequisite section, we gave a less interesting proof of this formula, by induction.) 
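
The pairing argument attributed to the young Gauss can be mirrored directly in code. This is a small sketch for illustration only (the function name is an assumption, not from the text): it pairs the list 1, ..., N - 1 with its reverse, checks that every pair has the same total, and halves the combined sum.

```python
def gauss_sum(N):
    """Sum of the natural numbers less than N, computed by pairing."""
    numbers = list(range(1, N))
    # Pair the list with itself in reverse order: n is matched with N - n.
    pair_totals = [x + y for x, y in zip(numbers, reversed(numbers))]
    # Every pair has total N, so twice the sum equals (N - 1) * N.
    assert all(total == N for total in pair_totals)
    return sum(pair_totals) // 2


# The classroom example: 1 + 2 + ... + 100 corresponds to N = 101.
assert gauss_sum(101) == 100 * 101 // 2
```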


Exercise 0.2.1. (a) Prove that $1 + 3 + 5 + \cdots + (2N - 1) = N^2$ for all $N \ge 1$ by induction.
(b) Prove the formula in part (a) by the young Gauss’s method.
(c) Start with a single dot, thought of as a 1-by-1 array of dots, and extend it to a 2-by-2 array
of dots by adding an appropriate row and column. You have added 3 dots to the original
dot and so $1 + 3 = 2^2$.


[Figure: nested square arrays of dots illustrating $1 + 3 + 5 + \cdots$]


In general, draw an N-by-N array of dots, and add an additional row and column of dots
to obtain an (N + 1)-by-(N + 1) array of dots. By determining how many dots were added
to the number of dots that were already in the array, deduce the formula in (a).


Let $S = \sum_{n=1}^{N-1} n^2$. Using exercise 0.2.1 we can write each square, $n^2$, as the
sum of the odd positive integers $< 2n$. Therefore $2m - 1$ appears $N - m$ times in
the sum for S, and so


$S = \sum_{m=1}^{N-1} (2m-1)(N-m) = -N \sum_{m=1}^{N-1} 1 + (2N+1) \sum_{m=1}^{N-1} m - 2S.$


⁴ This same idea appears in the work of Archimedes, from the third century B.C. in ancient Greece.

Using our closed formula for $\sum_m m$, we deduce, after some rearrangement, that


$\sum_{n=1}^{N-1} n^2 = \frac{(N-1)N(2N-1)}{6},$


a closed formula for the sum of the squares up to a given point. There is also a
closed formula for the sum of the cubes:


(0.2.2) $\sum_{n=1}^{N-1} n^3 = \left( \frac{(N-1)N}{2} \right)^2.$


This is the square of the closed formula (0.2.1) that we obtained for $\sum_{n=1}^{N-1} n$. Is
this a coincidence or the first hint of some surprising connection?
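
Before attempting a proof, it can be reassuring to test the closed formulas for squares and cubes against brute-force sums. The sketch below is illustrative (the function names are assumptions); a finite check of this kind is evidence, not a proof, and the induction argument is still needed.

```python
def power_sum(N, k):
    """Brute-force sum of n^k over the positive integers n < N."""
    return sum(n**k for n in range(1, N))


def squares_closed_form(N):
    """(N - 1) N (2N - 1) / 6, the closed formula for the sum of squares."""
    return (N - 1) * N * (2 * N - 1) // 6


def cubes_closed_form(N):
    """((N - 1) N / 2)^2, the closed formula (0.2.2) for the sum of cubes."""
    return ((N - 1) * N // 2) ** 2


for N in range(1, 200):
    assert power_sum(N, 2) == squares_closed_form(N)
    assert power_sum(N, 3) == cubes_closed_form(N)
    # The sum of the cubes is the square of the sum of the first powers.
    assert power_sum(N, 3) == power_sum(N, 1) ** 2
```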


Exercise 0.2.2. Prove these last two formulas by induction. 


These three examples suggest that there are closed formulas for the sums of the 
kth powers of the integers, for every k > 1, but it is difficult to guess exactly what 
those formulas might look like. Moreover, to hope to prove a formula by induction, 
we need to have the formula at hand. 


We will next find a closed formula in a simpler but related question and use this
to find a closed formula for the sums of the kth powers of the integers in appendix
0A. We will go on to investigate, in section 7.34 of appendix 7I, whether there are
other amazing identities for sums of different powers, like

$\sum_{n=1}^{N-1} n^3 = \left( \sum_{n=1}^{N-1} n \right)^2.$

0.3. The binomial theorem, Pascal’s triangle, and the binomial 
coefficients 


The binomial coefficient $\binom{n}{m}$ is defined to be the number of different ways of choosing
m objects from n. (Therefore $\binom{n}{m} = 0$ whenever m < 0 or m > n.) From this
definition we see that the binomial coefficients are all integers. To determine $\binom{5}{2}$ we
note that there are 5 choices for the first object and 4 for the second, but then we
have counted each pair of objects twice (since we can select them in either order),
and so $\binom{5}{2} = \frac{5 \times 4}{2}$. It is arguably nicer to write $5 \times 4$ as $\frac{5 \times 4 \times 3 \times 2 \times 1}{3 \times 2 \times 1} = \frac{5!}{3!}$, so that
$\binom{5}{2} = \frac{5!}{3!\,2!}$. One can develop this proof to show that, for any integers $0 \le m \le n$,
one has the very neat formula⁵

(0.3.1) $\binom{n}{m} = \frac{n!}{m!\,(n-m)!}$, where $r! = r \cdot (r-1) \cdots 2 \cdot 1$.


From this formula alone it is not obvious that the binomial coefficients are integers. 


Exercise 0.3.1. (a) Prove that $\binom{n+1}{m} = \binom{n}{m} + \binom{n}{m-1}$ for all integers m, and all integers $n \ge 0$.
(b) Deduce from (a) that each $\binom{n}{m}$ is an integer.

⁵ We prefer to work with the closed formula 27!/(15!12!) rather than to evaluate it as 17383860, since
the three factorials are easier to appreciate and to manipulate in subsequent calculations, particularly
when looking for patterns.

Pascal’s triangle is a triangular array in which the (n + 1)st row contains the
binomial coefficients $\binom{n}{m}$, with m increasing from 0 to n, as one goes from left to
right:


            1
           1 1
          1 2 1
         1 3 3 1
        1 4 6 4 1
      1 5 10 10 5 1
     1 6 15 20 15 6 1
           etc.


The addition formula in exercise 0.3.1(a) yields a rule for obtaining a row from the
previous one, by adding any two neighboring entries to give the entry immediately 
below. For example the third entry in the bottom row is immediately below 5 and 
10 (to either side) and so equals 5 + 10 = 15. The next entry is 10 + 10 = 20, etc. 
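
The row-by-row rule is easy to mechanize. The following sketch (the function name is illustrative) builds each row of Pascal's triangle from the previous one by adding neighboring entries, treating the positions just outside a row as 0.

```python
def pascal_rows(count):
    """Return the first `count` rows of Pascal's triangle."""
    rows = []
    row = [1]
    for _ in range(count):
        rows.append(row)
        # Pad with a 0 at each end; each new entry is the sum of the two
        # neighboring entries immediately above it.
        row = [left + right for left, right in zip([0] + row, row + [0])]
    return rows


for r in pascal_rows(7):
    print(" ".join(str(entry) for entry in r))
```

Since only additions of integers are involved, this construction also makes it plain that every entry is an integer.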


The binomial theorem states that if n is an integer $\ge 1$, then


$(x + y)^n = \sum_{m=0}^{n} \binom{n}{m} x^{n-m} y^m.$


Exercise 0.3.2.† Using exercise 0.3.1(a) and induction on $n \ge 1$, prove the binomial theorem.


Notice that one can read off the coefficients of $(x + y)^n$ from the (n + 1)st row
of Pascal’s triangle; for example, reading off the bottom row above (which is the
7th row down of Pascal’s triangle), we obtain


$(x + y)^6 = x^6 + 6x^5 y + 15x^4 y^2 + 20x^3 y^3 + 15x^2 y^4 + 6xy^5 + y^6.$


In the previous section we raised the question of finding a closed formula for
the sum of $n^k$, over all positive integers n < N. We can make headway in a related
question in which we replace $n^k$ with a different polynomial in n of degree k, namely
the binomial coefficient


$\binom{n}{k} = \frac{n(n-1) \cdots (n-k+1)}{k!}.$


This is a polynomial of degree k in n. For example, we have $\binom{n}{3} = \frac{n^3}{6} - \frac{n^2}{2} + \frac{n}{3}$, a
polynomial in n of degree 3. We can identify a closed formula for the sum of these
binomial coefficients, over all positive integers n < N, namely:


(0.3.2) $\sum_{n=0}^{N-1} \binom{n}{k} = \binom{N}{k+1}$

for all N and $k \ge 0$. For k = 2, N = 6, this can be seen in the following diagram:


[Diagram: Pascal’s triangle, highlighting the entries 1, 3, 6, 10 and 20]


so that 1 + 3 + 6 + 10 equals 20.
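
Identity (0.3.2) can be spot-checked with Python's standard `math.comb`, which returns 0 when the lower index exceeds the upper one. This is a sketch for experimentation only; the helper name `column_sum` is an assumption, and checking finitely many cases does not replace the induction proof.

```python
from math import comb


def column_sum(N, k):
    """Sum of the binomial coefficients C(n, k) over 0 <= n < N."""
    return sum(comb(n, k) for n in range(N))


# The worked example: k = 2, N = 6 gives 1 + 3 + 6 + 10 = 20.
assert column_sum(6, 2) == comb(6, 3) == 20

for N in range(0, 30):
    for k in range(0, 10):
        assert column_sum(N, k) == comb(N, k + 1)
```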


Exercise 0.3.3. Prove (0.3.2) for each fixed $k \ge 1$, for each $N \ge k + 1$, using induction and
exercise 0.3.1(a). You might also try to prove it by induction using the idea behind the illustration
in the last diagram. 


If we instead display Pascal’s triangle by lining up the initial 1’s and then
summing the diagonals,

1
1 1
1 2 1
1 3 3 1
1 4 6 4 1
1 5 10 10 5 1
1 6 15 20 15 6 1

the sums are 1, 1, 1+1, 1+2, 1+3+1, 1+4+3, 1+5+6+1, ... which equal
1, 1, 2, 3, 5, 8, 13, ..., the Fibonacci numbers. It therefore seems likely that


(0.3.3) $F_n = \sum_{k=0}^{n-1} \binom{n-1-k}{k}$ for all $n \ge 1$.

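
The conjectured identity (0.3.3) can be tested for small n before trying to prove it. The sketch below uses hypothetical helper names and the standard `math.comb`; it compares the diagonal sums with Fibonacci numbers generated from the recurrence.

```python
from math import comb


def diagonal_sum(n):
    """Right-hand side of (0.3.3): the sum of C(n-1-k, k) for 0 <= k <= n-1."""
    return sum(comb(n - 1 - k, k) for k in range(n))


def fibonacci(n):
    """F_n from the recurrence, with F_0 = 0 and F_1 = 1."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a


print([diagonal_sum(n) for n in range(1, 8)])
for n in range(1, 40):
    assert diagonal_sum(n) == fibonacci(n)
```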


Exercise 0.3.4. Prove (0.3.3) for each integer $n \ge 1$, by induction using exercise 0.3.1(a).


Articles with further thoughts on factorials and binomial coefficients 


[1] Manjul Bhargava, The factorial function and generalizations, Amer. Math. Monthly 107 (2000), 
783-799 (preprint). 


[2] John J. Watkins, chapter 5 of Number theory. A historical approach, Princeton University Press, 
2014. 
Additional exercises 
Exercise 0.4.1. (a) Prove that for all $n \ge 1$ we have
$\begin{pmatrix} 1 & 1 \\ 1 & 0 \end{pmatrix}^n = \begin{pmatrix} F_{n+1} & F_n \\ F_n & F_{n-1} \end{pmatrix}.$
(b) Deduce that $F_{n+1} F_{n-1} - F_n^2 = (-1)^n$ for all $n \ge 1$.
(c) Deduce that $F_{n+1}^2 - F_{n+1} F_n - F_n^2 = (-1)^n$ for all $n \ge 0$.


Exercise 0.4.2.† Deduce from (0.1.1) that the Fibonacci number $F_n$ is the nearest integer to
$\phi^n/\sqrt{5}$, for each integer $n \ge 0$, where the constant $\phi := \frac{1+\sqrt{5}}{2}$. This golden ratio appears in art
and architecture when attempting to describe “perfect proportions”.


Exercise 0.4.3. Prove that $F_n^2 + F_{n+3}^2 = 2(F_{n+1}^2 + F_{n+2}^2)$ for all $n \ge 0$.


Exercise 0.4.4. Prove that for all $n \ge 1$ we have
$F_{2n-1} = F_{n-1}^2 + F_n^2$ and $F_{2n} = F_{n+1}^2 - F_{n-1}^2$.


Exercise 0.4.5. Use (0.1.1) to prove the following:
(a) For every r we have $F_n^2 - F_{n+r} F_{n-r} = (-1)^{n-r} F_r^2$ for all $n \ge r$.
(b) For all $m \ge n \ge 0$ we have $F_m F_{n+1} - F_{m+1} F_n = (-1)^n F_{m-n}$.


Exercise 0.4.6. Let $u_0 = b$ and $u_{n+1} = a u_n$ for all $n \ge 0$. Give a formula for all $u_n$ with $n \ge 0$.


The expression 011010 is a string of 0’s and 1’s. There are $2^n$ strings of 0’s and
1’s of length n as there are two possibilities for each entry. Let $A_n$ be the set of
strings of 0’s and 1’s of length n which contain no two consecutive 1’s. Our example
011010 does not belong to $A_6$ as the second and third characters are consecutive
1’s, whereas 01001010 is in $A_8$. Calculations reveal that $|A_1| = 2$, $|A_2| = 3$, and
$|A_3| = 5$, data which suggests that perhaps $|A_n| = F_{n+2}$, the Fibonacci number.
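
The "calculations" mentioned here can be reproduced by brute force. This is a sketch with illustrative function names: it enumerates all strings of 0's and 1's of length n and keeps those with no two consecutive 1's. It is exponential in n, so it is only suitable for gathering small data.

```python
from itertools import product


def has_no_consecutive_ones(s):
    """True if the string of 0's and 1's contains no two consecutive 1's."""
    return "11" not in s


def count_strings(n):
    """Size of the set A_n, found by listing all 2^n strings of length n."""
    return sum(
        1
        for bits in product("01", repeat=n)
        if has_no_consecutive_ones("".join(bits))
    )


assert not has_no_consecutive_ones("011010")  # the example not in A_6
assert has_no_consecutive_ones("01001010")    # the example in A_8
print([count_strings(n) for n in range(1, 9)])
```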


Exercise 0.4.7.† (a) If 0w is a string of 0’s and 1’s of length n, prove that $0w \in A_n$ if and
only if $w \in A_{n-1}$.
(b) If 10w is a string of 0’s and 1’s of length n, prove that $10w \in A_n$ if and only if $w \in A_{n-2}$.
(c) Prove that $|A_n| = F_{n+2}$ for all $n \ge 1$, by induction on n.


Exercise 0.4.8.† Prove that every positive integer other than the powers of 2 can be written as
the sum of two or more consecutive integers.


Exercise 0.4.9. Prove that $\binom{n}{m} \binom{n-m}{a-m} = \binom{n}{a} \binom{a}{m}$ for any integers $n \ge a \ge m \ge 0$.


Exercise 0.4.10.† Suppose that a and b are integers and $\{x_n : n \ge 0\}$ is the second-order linear
recurrence sequence given by (0.1.2) with $x_0 = 0$ and $x_1 = 1$.
(a) Prove that for all non-negative integers m we have

$x_{m+k} = x_{m+1} x_k + b x_m x_{k-1}$ for all integers $k \ge 1$.

(b) Deduce that

$x_{2n+1} = x_{n+1}^2 + b x_n^2$ and $x_{2n} = x_{n+1} x_n + b x_n x_{n-1}$ for all natural numbers n.


Exercise 0.4.11. Suppose that the sequences $\{x_n : n \ge 0\}$ and $\{y_n : n \ge 0\}$ both satisfy (0.1.2)
and that $x_0 = 0$ and $x_1 = 1$, whereas $y_0$ and $y_1$ might be anything. Prove that

$y_n = y_1 x_n + b y_0 x_{n-1}$ for all $n \ge 1$.


Exercise 0.4.12. Suppose that $x_0 = 0$, $x_1 = 1$, and $x_{n+2} = a x_{n+1} + b x_n$. Prove that for all
$n \ge 1$ we have

(a) $(a + b - 1) \sum_{j=1}^{n} x_j = x_{n+1} + b x_n - 1$;

(b) $a \left( b^n x_0^2 + b^{n-1} x_1^2 + \cdots + b x_{n-1}^2 + x_n^2 \right) = x_n x_{n+1}$;

(c) $x_n^2 - x_{n-1} x_{n+1} = (-b)^{n-1}$.


Exercise 0.4.13. Suppose that $x_{n+2} = a x_{n+1} + b x_n$ for all $n \ge 0$.

(a) Show that

$\begin{pmatrix} x_{n+2} & x_{n+1} \\ x_{n+1} & x_n \end{pmatrix} = \begin{pmatrix} a & b \\ 1 & 0 \end{pmatrix}^n \begin{pmatrix} x_2 & x_1 \\ x_1 & x_0 \end{pmatrix}$ for all $n \ge 0$.

(b) Deduce that $x_{n+2} x_n - x_{n+1}^2 = c(-b)^n$ for all $n \ge 0$ where $c := x_2 x_0 - x_1^2$.


(c) Deduce that $x_{n+1}^2 - a x_{n+1} x_n - b x_n^2 = -c(-b)^n$.

Other number-theoretic sequences can be obtained from linear recurrences or
other types of recurrences. Besides the Fibonacci numbers, there is another sequence
of integers that is traditionally denoted by $(F_n)_{n \ge 0}$: These are the Fermat
numbers, $F_n = 2^{2^n} + 1$ for all $n \ge 0$ (see sections 3.11 of appendix 3A, 5.1, 5.25 of
appendix 5H, etc.).


Exercise 0.4.14. Show that if $F_0 = 3$ and $F_{n+1} = F_n^2 - 2F_n + 2$, then $F_n = 2^{2^n} + 1$ for all
$n \ge 0$.


Exercise 0.4.15. (a) Show that if $M_0 = 0$, $M_1 = 1$, and $M_{n+2} = 3M_{n+1} - 2M_n$ for all integers
$n \ge 0$, then $M_n = 2^n - 1$ for all integers $n \ge 0$. The integer $M_n$ is the nth Mersenne number
(see exercise 2.5.16 and sections 4.2, 5.1, etc.).
(b) Show that if $M_0 = 0$ with $M_{n+1} = 2M_n + 1$ for all $n \ge 0$, then $M_n = 2^n - 1$.


Exercise 0.4.16. We can reinterpret exercise 0.4.3 as giving a recurrence relation for the sequence
$\{F_n^2\}_{n \ge 0}$, where $F_n$ is the nth Fibonacci number; that is,


$F_{n+3}^2 = 2F_{n+2}^2 + 2F_{n+1}^2 - F_n^2$ for all $n \ge 0$.


Here $F_{n+3}^2$ is described in terms of the last three terms of the sequence; this is called a linear
recurrence of order 3. Prove that for any integer $k \ge 1$, the sequence $\{F_n^k\}_{n \ge 0}$ satisfies a linear
recurrence of order k + 1.


How to proceed through this book. It can be challenging to decide what 
proof technique to try on a given question. There is no simple guide—practice is 
what best helps decide how to proceed. Some students find Zeitz’s book 
helpful as it exhibits all of the important techniques in context. I like Conway and 
Guy’s since it has lots of great questions, beautifully discussed with great 
illustrations, and introduces quite a few of the topics from this book. 


A paper that questions one’s assumptions is 


[1] Richard K. Guy The strong law of small numbers, Amer. Math. Monthly, 95 (1988), 697-712. 


Appendices. The short version of this book will offer an appendix at the end 
of most chapters. Sometimes this will add a little more insight or will present a 
proof that is a little more difficult than what is normal for this course. The long 
version of the book will include many appendices after each chapter, highlighting 
directions one might use to develop the material for that chapter. For example, the 
extended version of chapter 0 contains the following appendices: 


Appendix 0A. A closed formula for sums of powers develops the ideas of section
0.2 to obtain a closed formula for the sum of $n^k$ for all positive integers n < N.

Appendix 0B. Generating functions gives a more elegant proof of the
closed formula for sums of kth powers using Bernoulli numbers and then discusses
the generating function for Fibonacci numbers and other recurrence sequences.


Appendix 0C. Finding roots of polynomials shows how to determine the roots
of cubic and quartic polynomials and discusses surds.


Appendix 0D. What is a group? introduces the notion of a group and looks in
detail at the commutativity of 2-by-2 matrices.


Appendix 0E. Rings and fields explains the point of developing these notions
in number-theoretic settings and goes on to define and study algebraic numbers.

Appendix 0F. Symmetric polynomials explains and sketches the proof of Newton’s
fundamental theorem of symmetric polynomials, which is the elementary way
mathematicians used to obtain information about properties of fixed fields before
Galois invented Galois theory! It allows one to further develop algebraic numbers
and number fields.


Appendix 0G. Constructibility introduces the ancient Greek questions of draw- 
ing a square that has area equal to that of a given circle, constructing a cube that 
has twice the volume of a given cube, and constructing an angle which is one third 
the size of a given angle, explaining what these questions have to do with construct- 
ing number fields. 


Chapter 1 


The Euclidean algorithm 


1.1. Finding the gcd 


Most readers will know the Euclidean algorithm, used to find the greatest common
divisor (gcd) of two given integers. For example, to determine the greatest common
divisor of 85 and 48, we begin by subtracting the smaller from the larger, 48 from
85, to obtain $85 - 48 = 37$. Now gcd(85, 48) = gcd(48, 37), because the common
divisors of 48 and 37 are precisely the same as those of 85 and 48, and so we apply
the algorithm again to the pair 48 and 37. So we subtract the smaller from the
larger to obtain $48 - 37 = 11$, so that gcd(48, 37) = gcd(37, 11). Next we should
subtract 11 from 37, but then we would only do so again, and a third time, so let’s
do all that in one go and take $37 - 3 \times 11 = 4$, to obtain gcd(37, 11) = gcd(11, 4).
Similarly we take $11 - 2 \times 4 = 3$, and then $4 - 3 = 1$, so that the gcd of 85 and 48
is 1. This is the Euclidean algorithm that you might already have seen,¹ but did
you ever prove that it really works?
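
The procedure in this example can be written as a short loop. The sketch below is illustrative rather than the book's formal statement of the algorithm (which is deferred to appendix 1A): it assumes positive integer inputs, uses Python's `%` operator to perform all the repeated subtractions of one step "in one go", and records the pairs visited along the way.

```python
def gcd_steps(a, b):
    """Return (gcd, pairs visited) for positive integers a and b."""
    steps = []
    while b != 0:
        steps.append((a, b))
        # Replace (a, b) by (b, r), where r is the remainder of a divided by b.
        a, b = b, a % b
    return a, steps


g, steps = gcd_steps(85, 48)
print(g)      # the gcd of 85 and 48
print(steps)  # the successive pairs, starting with (85, 48)
```

For the example in the text the pairs visited are (85, 48), (48, 37), (37, 11), (11, 4), (4, 3), (3, 1), matching the chain of equalities above.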


To do so, we will first carefully define terms that we have implicitly used in the 
above paragraph, perhaps mathematical terms that you have used for years (such 
as “divides”, “quotient”, and “remainder”) without a formal definition. This may
seem pedantic but the goal is to make sure that the rules of basic arithmetic are 
really established on a sound footing. 


Let a and b be given integers. We say that a is divisible by b, or that b divides a,²
if there exists an integer q such that a = qb. For convenience we write “b | a”.³ ⁴
We now set an exercise for the reader to check that the definition allows one to
manipulate the notion of division in several familiar ways.


Exercise 1.1.1. In this question, and throughout, we assume that a, b, and c are integers.
(a) Prove that if b divides a, then either a = 0 or $|a| \ge |b|$.


¹ There will be a formal discussion of the Euclidean algorithm in appendix 1A.

² One can also say a is a multiple of b or b is a divisor of a or b is a factor of a.

³ And if b does not divide a, we write “$b \nmid a$”.

⁴ One reason for giving a precise mathematical definition for division is that it allows us to better
decide how to interpret questions like, “What is 1 divided by 0?” or “What is 0 divided by 0?”

(b) Deduce that if a | b and b | a, then b = a or $b = -a$ (which, in future, we will write as
“$b = \pm a$”).
(c) Prove that if a divides b and c, then a divides bx + cy for all integers x, y.
(d) Prove that a divides b if and only if a divides $-b$ if and only if $-a$ divides b.
(e) Prove that if a divides b, and b divides c, then a divides c.
(f) Prove that if $a \ne 0$ and ac divides ab, then c divides b.


Next we formalize the notion of “dividing with remainder”.


Lemma 1.1.1. If a and b are integers, with $b \ge 1$, then there exist unique integers
q and r, with $0 \le r \le b - 1$, such that a = qb + r. We call q the “quotient”, and r
the “remainder”.


Proof by induction. We begin by proving the existence of q and r. For each
$b \ge 1$, we proceed by induction on $a \ge 0$. If $0 \le a \le b - 1$, then the result follows
with q = 0 and r = a. Otherwise assume that the result holds for $0, 1, 2, \ldots, a - 1$,
where $a \ge b$. Then $a - 1 \ge a - b \ge 0$ so, by the induction hypothesis, there exist
integers Q and r, with $0 \le r \le b - 1$, for which $a - b = Qb + r$. Therefore a = qb + r
with q = Q + 1.

If a < 0, then $-a > 0$ so we have $-a = Qb + R$, for some integers Q and R,
with $0 \le R \le b - 1$, by the previous paragraph. If R = 0, then a = qb where
$q = -Q$ (and r = 0). Otherwise $1 \le R \le b - 1$ and so a = qb + r with $q = -Q - 1$
and $1 \le r = b - R \le b - 1$, as required.

Now we show that q and r are unique. If a = qb + r = Qb + R, then b divides
$(q - Q)b = R - r$. However $0 \le r, R \le b - 1$ so that $|R - r| \le b - 1$, and $b \mid R - r$.
Therefore $R - r = 0$ by exercise 1.1.1(a), and so $Q - q = 0$. In other words q = Q
and r = R; that is, the pair q, r is unique.


An easier, but less intuitive, proof. We can add a multiple of b to a to get a
positive integer. That is, there exists an integer n such that a + nb > 0; any integer
$n > -a/b$ will do. We now subtract multiples of b from this number, as long as it
remains positive, until subtracting b once more would make it negative. In other
words we now have an integer $a - qb \ge 0$, which we denote by r, such that $r - b < 0$;
in other words $0 \le r \le b - 1$.
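
The add-then-subtract idea in this proof translates into a short, if inefficient, routine. The sketch below is an illustration with a hypothetical function name; it adjusts by one multiple of b at a time rather than choosing n up front, and compares the result with Python's built-in `divmod`, which for a positive divisor also returns a remainder in the range 0 to b - 1.

```python
def quotient_remainder(a, b):
    """Return (q, r) with a == q*b + r and 0 <= r <= b - 1, for b >= 1."""
    if b < 1:
        raise ValueError("b must be at least 1")
    q, r = 0, a
    while r < 0:   # add multiples of b until the number is non-negative
        r += b
        q -= 1
    while r >= b:  # subtract b as long as the result stays non-negative
        r -= b
        q += 1
    return q, r


for a in range(-50, 51):
    for b in range(1, 12):
        q, r = quotient_remainder(a, b)
        assert a == q * b + r and 0 <= r <= b - 1
        assert (q, r) == divmod(a, b)
```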


Exercise 1.1.2. Suppose that $a \ge 1$ and $b \ge 2$ are integers. Show that we can write a in base b;
that is, show that there exist integers $a_0, a_1, \ldots \in [0, b - 1]$ for which $a = a_d b^d + a_{d-1} b^{d-1} + \cdots + a_1 b + a_0$.


We say that d is a common divisor of integers a and b if d divides both a and
b. We are mostly interested in the greatest common divisor of a and b, which we
denote by gcd(a, b), or more simply as (a, b).⁵ ⁶

We say that a is coprime with b, or that a and b are coprime integers, or that
a and b are relatively prime, if (a, b) = 1.


⁵ In the UK this is known as the highest common factor of a and b and is written hcf(a, b).

⁶ When a = b = 0, every integer is a divisor of 0, so there is no greatest divisor, and therefore
gcd(0,0) is undefined. There are often one or two cases in which a generally useful mathematical 
definition does not give a unique value. Another example is 0 divided by 0, which we explore in exercise 
For aesthetic reasons, some authors choose to assign a value which is consistent with the theory 
in one situation but perhaps not in another. This can lead to artificial inconsistencies which is why we 
choose to leave such function-values undefined.

Corollary 1.1.1. If a = qb + r where a, b, q, and r are integers, then
gcd(a, b) = gcd(b, r).


Proof. Let g = gcd(a, b) and h = gcd(r, b). Now g divides both a and b, so g
divides $a - qb = r$ (by exercise 1.1.1(c)). Therefore g is a common divisor of both r
and b, and therefore $g \le h$. Similarly h divides both b and r, so h divides qb + r = a
and hence h is a common divisor of both a and b, and therefore $h \le g$. We have
shown that $g \le h$ and $h \le g$, which together imply that g = h.


Corollary 1.1.1 justifies the method used to determine the gcd of 85 and 48 in
the first paragraph of section 1.1, and indeed in general:

Exercise 1.1.3. Use Corollary 1.1.1 to prove that the Euclidean algorithm indeed yields the
greatest common divisor of two given integers. (You might prove this by induction on the smallest
of the two integers.)


Exercise 1.1.4. Prove that $(F_n, F_{n+1}) = 1$ by induction on $n \ge 0$.


1.2. Linear combinations 


The Euclidean algorithm can also be used to determine a linear combination⁷ of
a and b, over the integers, which equals gcd(a, b); that is, one can always use the
Euclidean algorithm to find integers u and v such that

(1.2.1) au + bv = gcd(a, b).

Let us see how to do this in an example, by finding integers u and v such that
85u + 48v = 1; remember that we found the gcd of 85 and 48 at the beginning of
section 1.1. We retrace the steps of the Euclidean algorithm, but in reverse: The
final step was that $1 = 1 \cdot 4 - 1 \cdot 3$, a linear combination of 4 and 3. The second to
last step used that $3 = 11 - 2 \cdot 4$, and so substituting $11 - 2 \cdot 4$ for 3 in $1 = 1 \cdot 4 - 1 \cdot 3$,
we obtain

$1 = 1 \cdot 4 - 1 \cdot 3 = 1 \cdot 4 - 1 \cdot (11 - 2 \cdot 4) = 3 \cdot 4 - 1 \cdot 11,$

a linear combination of 11 and 4. This then implies, since we had $4 = 37 - 3 \cdot 11$,
that

$1 = 3 \cdot (37 - 3 \cdot 11) - 1 \cdot 11 = 3 \cdot 37 - 10 \cdot 11,$

a linear combination of 37 and 11. Continuing in this way, we successively deduce,
using that $11 = 48 - 37$ and then that $37 = 85 - 48$,


$1 = 3 \cdot 37 - 10 \cdot (48 - 37) = 13 \cdot 37 - 10 \cdot 48$
$\phantom{1} = 13 \cdot (85 - 48) - 10 \cdot 48 = 13 \cdot 85 - 23 \cdot 48;$

that is, we have the desired linear combination of 85 and 48.
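
The back-substitution carried out by hand above can be organized as a recursion. This is one common way to code it, offered as a sketch rather than as the book's own formulation: the function name is illustrative, the inputs are assumed to be non-negative integers that are not both 0, and the recursion bottoms out at the pair (g, 0), where $g = g \cdot 1 + 0 \cdot 0$.

```python
def extended_gcd(a, b):
    """Return (g, u, v) with a*u + b*v == g == gcd(a, b), for a >= 0, b >= 0."""
    if b == 0:
        return a, 1, 0  # a*1 + 0*0 == a
    q, r = divmod(a, b)  # a == q*b + r
    g, u, v = extended_gcd(b, r)  # b*u + r*v == g
    # Substitute r == a - q*b:  g == b*u + (a - q*b)*v == a*v + b*(u - q*v).
    return g, v, u - q * v


g, u, v = extended_gcd(85, 48)
assert 85 * u + 48 * v == g == 1
print(g, u, v)
```

For 85 and 48 this recursion happens to return the same coefficients, 13 and -23, as the hand computation; other valid pairs (u, v) exist, so a different implementation could legitimately return different coefficients.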


To prove that this method always works, we will use Lemma 1.1.1 again: Suppose
that a = qb + r so that gcd(a, b) = gcd(b, r) by Corollary 1.1.1, and that we
have $bu - rv = \gcd(b, r)$ for some integers u and v. Then


(1.2.2) $\gcd(a, b) = \gcd(b, r) = bu - rv = bu - (a - qb)v = b(u + qv) - av,$


⁷ A linear combination of two given integers a and b, over the integers, is a number of the form ax + by
where x and y are integers. This can be generalized to yield a linear combination $a_1 x_1 + \cdots + a_n x_n$
of any finite set of integers, $a_1, \ldots, a_n$. Linear combinations are a key concept in linear algebra and
appear (without necessarily being called that) in many courses.

the desired linear combination of a and b. This argument forms the basis of our
proof of (1.2.1), but to give a complete proof we proceed by induction on the smaller
of a and b:


Theorem 1.1. If a and b are positive integers, then there exist integers u and v
such that
au + bv = gcd(a, b).


Proof. Interchanging a and b if necessary we may assume that $a \ge b \ge 1$. We shall
prove the result by induction on b. If b = 1, then b only has the divisor 1, so that


$\gcd(a, 1) = 1 = 0 \cdot a + 1 \cdot 1.$

We now prove the result for b > 1: If b divides a, then

$\gcd(b, a) = b = 0 \cdot a + 1 \cdot b.$


Otherwise b does not divide a and so Lemma 1.1.1 implies that there exist integers
q and r such that a = qb + r and $1 \le r \le b - 1$. Since $1 \le r < b$ we know, by the
induction hypothesis, that there exist integers u and v for which $bu - rv = \gcd(b, r)$.
The result then follows by (1.2.2).


We now establish various useful properties of the gcd: 


Exercise 1.2.1. (a) Prove that if d divides both a and b, then d divides gcd(a, b).
(b) Deduce that d divides both a and b if and only if d divides gcd(a, b).
(c) Prove that $1 \le \gcd(a, b) \le |a|$ and $|b|$.
(d) Prove that gcd(a, b) = |a| if and only if a divides b.

Exercise 1.2.2. Suppose that a divides m, and b divides n.
(a) Deduce that gcd(a, b) divides gcd(m, n).
(b) Deduce that if gcd(m, n) = 1, then gcd(a, b) = 1.

Exercise 1.2.3. Show that Theorem 1.1 holds for any integers a and b that are not both 0. (It
is currently stated and proved only for positive integers a and b.)


Corollary 1.2.1. If a and b are integers for which gcd(a, b) = 1, then there exist
integers u and v such that
au + bv = 1.


This is one of the most useful results in mathematics and has applications in 
many areas, including in safeguarding today’s global communications. For example, 
we will see in section [0.3] that to implement RSA, a key cryptographic protocol 
that helps keep important messages safe in our electronic world, one uses Corollary 
1.2.1 in an essential way. More on that later, after developing more basic number
theory. 


Exercise 1.2.4. (a) Use exercise 1.1.1(c) to show that if au + bv = 1, then (a, b) = (u, v) = 1.
(b) Prove that gcd(u, v) = 1 in Theorem 1.1.


Corollary 1.2.2. If gcd(a, m) = gcd(b, m) = 1, then gcd(ab, m) = 1.


Proof. By Theorem 1.1 there exist integers r, s, u, v such that


au + mv = br + ms = 1.

Therefore


ab(ur) + m(bvr + aus + msv) = (au + mv)(br + ms) = 1,


and the result follows from exercise 1.2.4(a).


Corollary 1.2.3. We have $\gcd(ma, mb) = m \cdot \gcd(a, b)$ for all integers $m \ge 1$.


Proof. By Theorem 1.1 there exist integers u, v such that au + bv = gcd(a, b). Now
gcd(ma, mb) divides ma and mb so it divides $mau + mbv = m \cdot \gcd(a, b)$. Similarly
gcd(a, b) divides a and b, so that $m \cdot \gcd(a, b)$ divides ma and mb, and therefore
gcd(ma, mb) by exercise 1.2.1(a). The result follows from exercise 1.1.1(b), since
the gcd is always positive.


Exercise 1.2.5. (a) Show that if A and B are given integers, not both 0, with g = gcd(A, B), 
then gcd(A/g, B/g) = 1. 
(b) Prove that any rational number u/v where $u, v \in \mathbb{Z}$ with $v \ne 0$ may be written as r/s where
r and s are coprime integers with s > 0. This is called a reduced fraction. 


1.3. The set of linear combinations of two integers 


Theorem 1.1 states that the greatest common divisor of two integers is a linear
combination of those two integers. This suggests that it might be useful to study
the set of linear combinations


$I(a, b) := \{am + bn : m, n \in \mathbb{Z}\}$


of two given integers a and b.⁸ We see that I(a, b) contains $0,\ a,\ b,\ a + b,\ a + 2b,\ 2b + a,\ a - b,\ b - a, \ldots$
and any sum of integer multiples of a and b, so that
I(a, b) is closed under addition. Let $I(a) := I(a, 0) = \{am : m \in \mathbb{Z}\}$, the set of
integer multiples of a. We now prove that I(a, b) can be described as the set of
integer multiples of gcd(a, b), a set which is easier to understand:


Corollary 1.3.1. For any given non-zero integers a and b, we have
$\{am + bn : m, n \in \mathbb{Z}\} = \{gk : k \in \mathbb{Z}\}$


where g := gcd(a, b); that is, I(a, b) = I(g). In other words, there exist integers m
and n with am + bn = c if and only if gcd(a, b) divides c.


Proof. By Theorem 1.1 we know that there exist $u, v \in \mathbb{Z}$ for which au + bv = g.
Therefore a(uk) + b(vk) = gk so that $gk \in I(a, b)$ for all $k \in \mathbb{Z}$; that is, $I(g) \subset I(a, b)$.
On the other hand, as g divides both a and b, there exist integers A, B such that
a = gA, b = gB, and so any $am + bn = g(Am + Bn) \in I(g)$. That is, $I(a, b) \subset I(g)$.
The result now follows from the two inclusions.
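
The last sentence of Corollary 1.3.1 gives a practical test for solving am + bn = c, and its proof shows how to build a solution by scaling. The sketch below is illustrative: the function names are assumptions, it is restricted to positive a and b for simplicity, and it returns just one solution (or `None` when gcd(a, b) does not divide c).

```python
def extended_gcd(a, b):
    """Return (g, u, v) with a*u + b*v == g == gcd(a, b), for a >= 0, b >= 0."""
    if b == 0:
        return a, 1, 0
    q, r = divmod(a, b)
    g, u, v = extended_gcd(b, r)
    return g, v, u - q * v


def solve_linear(a, b, c):
    """One integer solution (m, n) of a*m + b*n == c, or None if none exists.

    Assumes a and b are positive integers.
    """
    g, u, v = extended_gcd(a, b)
    if c % g != 0:
        return None  # c is not a multiple of g, so c is not in I(a, b)
    k = c // g
    return u * k, v * k  # a*(u*k) + b*(v*k) == g*k == c


m, n = solve_linear(85, 48, 7)
assert 85 * m + 48 * n == 7
assert solve_linear(6, 4, 5) is None  # gcd(6, 4) = 2 does not divide 5
```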


It is instructive to see how this result follows directly from the Euclidean algo- 
rithm: In our example, we are interested in gcd(85, 48), so we will study I(85, 48), 
that is, the set of integers of the form 


85m + 48n. 


5 This is usually called the ideal generated by a and 6 in Z and denoted by (a,b)z. The notion of 
an ideal is one of the basic tools of modern algebra, as we will discuss in appendix 3D. 
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The first step in the Euclidean algorithm was to write 85 = 1-48+37. Substituting 
this in above yields 


85m + 48n = (1-48 + 37)m + 48n = 48(m + n) + 37m, 
and so (85,48) C I(48, 37). In the other direction, any integer in [(48,37) can be 
written as 
48a + 37b = 48a + (85 — 48)b = 85b + 48(a — b), 
and so belongs to (85,48). Combining these last two statements yields that 
1(85,48) = I(48, 37). 

Each step of the Euclidean algorithm leads to a similar equality, and so we get 

I(85,48) = I(48,37) = I(37,11) = I(11,4) = I(4,3) = I(3,1) = I(1,0) = I(1).
To truly justify this we need to establish an analogous result to Corollary 1.1.1:
Lemma 1.3.1. If a = qb + r where a, b, q, and r are integers, then I(a,b) = I(b,r).


Proof. We begin by noting that
am + bn = (qb + r)m + bn = b(qm + n) + rm
so that I(a,b) ⊆ I(b,r). In the other direction
bu + rv = bu + (a − qb)v = av + b(u − qv)


so that I(b,r) ⊆ I(a,b). The result follows by combining the two inclusions.
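
As a small illustration of the lemma, the sketch below lists the successive generator pairs produced by repeated division and checks the coefficient substitution used in the proof. The function names `ideal_chain` and `rewrite_coefficients` are hypothetical, chosen only for this example.

```python
def ideal_chain(a, b):
    """List the pairs (a, b), (b, r), ... obtained by repeatedly writing a = q*b + r."""
    pairs = [(a, b)]
    while b != 0:
        a, b = b, a % b
        pairs.append((a, b))
    return pairs


def rewrite_coefficients(a, b, m, n):
    """Given a*m + b*n with a = q*b + r, return (m2, n2) such that b*m2 + r*n2 == a*m + b*n."""
    q, r = divmod(a, b)
    return q * m + n, m
```

Calling `ideal_chain(85, 48)` reproduces the pairs (85,48), (48,37), (37,11), (11,4), (4,3), (3,1), (1,0) of the worked example.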


We have used the Euclidean algorithm to find the gcd of any two given integers 
a and b, as well as to determine integers u and v for which au + bv = gcd(a,b).
The price for obtaining the actual values of u and v, rather than merely proving
the existence of u and v (which is all that was claimed in Theorem 1.1), was our
somewhat complicated analysis of the Euclidean algorithm. However, if we only 
wish to prove that such integers u and v exist, then we can do so with a somewhat 
easier proof: 


Non-constructive proof of Theorem 1.1. Let h be the smallest positive integer
that belongs to I(a,b), say h = au + bv. Then g := gcd(a,b) divides h, as g
divides both a and b.


Now a = a·1 + b·0 so that a ∈ I(a,b), and 1 ≤ h ≤ a by the definition of h.
Therefore Lemma 1.1.1 implies that there exist integers q and r, with 0 ≤ r ≤ h − 1,
for which a = qh + r. Therefore


r = a − qh = a − q(au + bv) = a(1 − qu) + b(−qv) ∈ I(a,b),


which contradicts the minimality of h, unless r = 0; that is, h divides a. An
analogous argument reveals that h divides b, and so h divides g by exercise 1.2.1(a).


°We will now prove the existence of u and v by showing that their non-existence would lead to a 
contradiction. We will develop other instances, as we proceed, of both constructive and non-constructive 
proofs of important theorems. 

Which type of proof is preferable? This is somewhat a matter of taste. The non-constructive proof 
is often shorter and more elegant. The constructive proof, on the other hand, is practical—that is, it 
gives solutions. It is also “richer” in that it develops more than is (immediately) needed, though some 
might say that these extras are irrelevant. 

Which type of proof has the greatest clarity? That depends on the algorithm devised for the con- 
structive proof. A compact algorithm will often cast light on the subject. But a cumbersome one may 
obscure it. In this case, the Euclidean algorithm is remarkably simple and efficient ({Shas5] p. 11)).

Hence g divides h, and h divides g, and g and h are both positive, so that g = h
as desired. 


We say that the integers a, b, and c are relatively prime if gcd(a, b,c) = 1. We 
say that they are pairwise coprime if gcd(a,b) = gcd(a,c) = gcd(b,c) = 1. For 
example, 6,10,15 are relatively prime, but they are not pairwise coprime (since 
each pair of integers has a common factor > 1). 

Exercise 1.3.1. Suppose that a, b, and c are non-zero integers for which a+ b= c. 
(a) Show that a,b,c are relatively prime if and only if they are pairwise coprime. 
(b) Show that (a,b) = (a,c) = (b,c). 
(c) Show that the analogy to (a) is false for integer solutions a, b, c, d to a + b = c + d (perhaps
by constructing a counterexample). 


1.4. The least common multiple 


The least common multiple¹⁰ of two given integers a and b is defined to be the
smallest positive integer that is a multiple of both a and b. We denote this by
lcm[a, b] (or simply [a, b]). We now prove the counterpart to exercise 1.2.1(a):


Lemma 1.4.1. lcm[a, b] divides an integer m if and only if a and b both divide m.


Proof. Since a and b divide lcm[a, b], if lcm[a, b] divides m, then a and b both
divide m, by exercise 1.1.1(e).

On the other hand suppose a and b both divide m, and write m = q·lcm[a, b] + r
where 0 ≤ r < lcm[a, b]. Now a and b both divide m and lcm[a, b] so they both
divide m − q·lcm[a, b] = r. However lcm[a, b] is defined to be the smallest positive
integer that is divisible by both a and b, which implies that r must be 0. Therefore
lcm[a, b] divides m.
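
The lemma is easy to test on small cases. The sketch below computes the least common multiple directly from its definition (a search through multiples, not an efficient method) and compares the two sides of the lemma. The names `lcm` and `check_lemma` are hypothetical, and a and b are assumed to be non-zero.

```python
def lcm(a, b):
    """Smallest positive integer divisible by both a and b, found by direct search."""
    a, b = abs(a), abs(b)
    step = max(a, b)
    candidate = step
    # Every common multiple is a multiple of max(a, b), so it suffices to test those.
    while candidate % a != 0 or candidate % b != 0:
        candidate += step
    return candidate


def check_lemma(a, b, limit):
    """Check, for 1 <= m <= limit, that lcm[a, b] | m exactly when a | m and b | m."""
    l = lcm(a, b)
    return all((m % l == 0) == (m % a == 0 and m % b == 0) for m in range(1, limit + 1))
```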


The analogies to exercise 1.2.1(d) and Corollary 1.2.3 for lcms are given by the
following two exercises:


Exercise 1.4.1. Prove that lcm[m, n] = n if and only if m divides n.


Exercise 1.4.2. Prove that lcm[ma, mb] = m·lcm[a, b] for any positive integer m.


1.5. Continued fractions 


Another way to write Lemma 1.1.1 is that for any given integers a > b ≥ 1 with
b ∤ a, there exist integers q and r, with b > r ≥ 1, for which

$$\frac{a}{b} = q + \frac{r}{b} = q + \frac{1}{\frac{b}{r}}.$$

This is admittedly a strange way to write things, but repeating this process with 
the pair of integers b and r, and then again, will eventually lead us to an interesting 


representation of the original fraction a/b. Working with our original example, in 
which we found the gcd of 85 and 48, we can represent 85 = 48 + 37 as 


$$\frac{85}{48} = 1 + \frac{37}{48} = 1 + \frac{1}{\frac{48}{37}},$$


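10 Sometimes called the lowest common multiple.

and the next step, 48 = 37 + 11, as

$$\frac{48}{37} = 1 + \frac{1}{\frac{37}{11}}, \quad\text{so that}\quad \frac{85}{48} = 1 + \frac{1}{\frac{48}{37}} = 1 + \cfrac{1}{1 + \cfrac{1}{\frac{37}{11}}}.$$

The remaining steps of the Euclidean algorithm may be rewritten as

$$\frac{37}{11} = 3 + \frac{1}{\frac{11}{4}}, \quad \frac{11}{4} = 2 + \frac{1}{\frac{4}{3}}, \quad\text{and}\quad \frac{4}{3} = 1 + \frac{1}{3},$$

so that

$$\frac{85}{48} = 1 + \cfrac{1}{1 + \cfrac{1}{3 + \cfrac{1}{2 + \cfrac{1}{1 + \cfrac{1}{3}}}}}.$$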


This is the continued fraction for 85/48 and is conveniently written as [1, 1, 3, 2, 1, 3].


Notice that this is the sequence of quotients $a_i$ from the various divisions; that is,

$$\frac{a}{b} = [a_0, a_1, a_2, \ldots, a_k] := a_0 + \cfrac{1}{a_1 + \cfrac{1}{a_2 + \cfrac{1}{\ddots + \cfrac{1}{a_k}}}}.$$


The $a_i$'s are called the partial quotients of the continued fraction.


Exercise 1.5.1. (a) Show that if $a_k > 1$, then $[a_0, a_1, \ldots, a_k] = [a_0, a_1, \ldots, a_k - 1, 1]$.
(b) Prove that the positive rational numbers are in 1-1 correspondence with the finite
length continued fractions that do not end in 1. 


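We now list the rationals that correspond to the first few entries in our continued
fraction [1, 1, 3, 2, 1, 3]. We have [1] = 1, [1, 1] = 2, and

$$1 + \cfrac{1}{1 + \cfrac{1}{3}} = \frac{7}{4}, \quad 1 + \cfrac{1}{1 + \cfrac{1}{3 + \cfrac{1}{2}}} = \frac{16}{9}, \quad 1 + \cfrac{1}{1 + \cfrac{1}{3 + \cfrac{1}{2 + \cfrac{1}{1}}}} = \frac{23}{13}.$$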


These yield increasingly good approximations to 85/48 = 1.770833..., that is, in 
decimal notation, 
1, 2, 1.75, 1.777..., 1.7692.... 


We call these $p_j/q_j$, $j \geq 1$, the convergents for the continued fraction, defined by

$$\frac{p_j}{q_j} = [a_0, a_1, a_2, \ldots, a_j],$$

since they converge to $a/b = p_k/q_k$ for some k. Do you notice anything surprising
about the convergents for 85/48? In particular the previous one, namely 23/13?
When we worked through the Euclidean algorithm we found that 13·85 − 23·48 = 1;
could it be a coincidence that these same numbers show up again in this new
context? In section 1.8 of appendix 1A we show that this is no coincidence; indeed
we always have

$$p_j q_{j-1} - p_{j-1} q_j = (-1)^{j-1},$$

so, in general, if $u = (-1)^{k-1} q_{k-1}$ and $v = (-1)^k p_{k-1}$, then


au + bv = 1.
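
These claims can be verified by machine for the running example. The sketch below is an illustration under stated assumptions: the names `partial_quotients` and `convergents` are hypothetical, and the convergents are computed with the standard two-term recurrence $p_j = a_j p_{j-1} + p_{j-2}$, $q_j = a_j q_{j-1} + q_{j-2}$, which is not derived in this section.

```python
def partial_quotients(a, b):
    """Quotients a_0, a_1, ..., a_k produced by the Euclidean algorithm on (a, b)."""
    quotients = []
    while b != 0:
        q, r = divmod(a, b)
        quotients.append(q)
        a, b = b, r
    return quotients


def convergents(quotients):
    """Return [(p_0, q_0), ..., (p_k, q_k)] for the continued fraction [a_0, ..., a_k]."""
    p_prev, q_prev = 1, 0          # conventional starting values p_{-1}, q_{-1}
    p, q = quotients[0], 1
    result = [(p, q)]
    for a_j in quotients[1:]:
        p, p_prev = a_j * p + p_prev, p
        q, q_prev = a_j * q + q_prev, q
        result.append((p, q))
    return result
```

For 85/48 this gives the quotients [1, 1, 3, 2, 1, 3] and the convergents 1/1, 2/1, 7/4, 16/9, 23/13, 85/48, from which u = 13 and v = −23 can be read off.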


When one studies this in detail, one finds that the continued fraction is really 
just a convenient reworking of the Euclidean algorithm (as we explained it above)
for finding u and v. Bachet de Méziriac¹¹ introduced this method to Renaissance
mathematicians in the second edition of his brilliantly named book Pleasant and 
delectable problems which are made from numbers (1624). Such methods had been 
known from ancient times, certainly to the Indian scholar Aryabhata in 499 A.D., 
probably to Archimedes in Syracuse (Greece) in 250 B.C., and possibly to the 
Babylonians as far back as 1700 B.C.¹²


1.6. Tiling a rectangle with squares¹³


Given a 48-by-85 rectangle we will tile it, greedily, with squares. The largest square 
that we can place inside a 48-by-85 rectangle is a 48-by-48 square. This 48-by-48 
square goes from top to bottom of the rectangle, and if we place it at the far right, 
then we are left with a 37-by-48 rectangle to tile, on the left. 




Figure 1.1. Partitioning a rectangle into squares, using the Euclidean algorithm. 


If we place a 37-by-37 square at the top of this rectangle, then we are left with an 
11-by-37 rectangle in the bottom left-hand corner. We can now place three 11-by-11 
squares inside this, leaving a 4-by-11 rectangle. We finish this off with two 4-by-4 
squares, one 3-by-3 square, and finally three 1-by-1 squares. 


11 The celebrated editor and commentator on Diophantus, whom we will meet again in chapter 6.

12 There are cuneiform clay tablets from this era that contain related calculations. It is known
that after conquering Babylon in 331 B.C., Alexander the Great ordered his archivist Callisthenes and
his tutor Aristotle to supervise the translation of the Babylonian astronomical records into Greek. It is
therefore feasible that Archimedes was introduced to these ideas from this source. Indeed, Pythagoras’s
Theorem may be misnamed as the Babylonians knew of integer-sided right-angled triangles like 3, 4, 5
and 5, 12, 13 more than one thousand years before Pythagoras (570-495 B.C.) was born.

13 Thanks to Dusa MacDuff and Dylan Thurston for bringing my attention to this beautiful
application.

The area of the rectangle can be computed in terms of the areas of each of the
squares; that is, 


85·48 = 1·48² + 1·37² + 3·11² + 2·4² + 1·3² + 3·1².


What has this to do with the Euclidean algorithm? Hopefully the reader has 
recognized the same sequence of numbers and quotients that appeared above, when 
we computed the gcd(85,48). This is no coincidence. At a given step we have an 
a-by-b rectangle, with a ≥ b ≥ 1, and then we can remove q b-by-b squares, where
a = qb + r with 0 ≤ r ≤ b − 1, leaving an r-by-b rectangle, and so proceed by
induction. 
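
The greedy tiling is simply the Euclidean algorithm with the quotients recorded as counts of squares. A minimal sketch, with the hypothetical function name `greedy_squares`, assuming positive integer side lengths:

```python
def greedy_squares(a, b):
    """Return (count, side) pairs for the greedy tiling of an a-by-b rectangle by squares."""
    if a < b:
        a, b = b, a
    tiles = []
    while b != 0:
        q, r = divmod(a, b)   # q squares of side b fit, leaving an r-by-b rectangle
        tiles.append((q, b))
        a, b = b, r
    return tiles
```

For the 48-by-85 rectangle this returns one 48-square, one 37-square, three 11-squares, two 4-squares, one 3-square and three 1-squares, whose areas sum to 85·48 = 4080.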


Exercise 1.6.1. Given an a-by-b rectangle show how to write a-b as a sum of squares, as above, 
in terms of the partial quotients and convergents of the continued fraction for a/b. 


Exercise 1.6.2. (a) Use this to show that $F_{n+1}F_n = F_n^2 + F_{n-1}^2 + \cdots + F_0^2$, where $F_n$ is
the nth Fibonacci number (see section 0.1 for the definition and a discussion of Fibonacci
numbers and exercise 0.4.12(b) for a generalization of this exercise).


(b)? Find the correct generalization to more general second-order linear recurrence sequences. 


Additional exercises 


Exercise 1.7.1. (a) Does 0 divide 0? (Use the definition of “divides”.)
(b) Show that there is no unique meaning to 0/0.
(c) Prove that if b divides a and b ≠ 0, then there is a unique meaning to a/b.


Exercise 1.7.2. Prove that if a and b are not both 0, then gcd(a, b) is a positive integer. 


Exercise 1.7.3.† Prove that if m and n are coprime positive integers, then $\frac{(m+n-1)!}{m!\,n!}$ is an
integer.


Exercise 1.7.4. Suppose that a = qb + r with 0 ≤ r ≤ b − 1.

(a) Let [t] be the integer part of t, that is, the largest integer ≤ t. Prove that q = [a/b].

(b) Let {t} be the fractional part of t, that is, {t} = t − [t]. Prove that r = b{r/b} = b{a/b}.
(Beware of these functions applied to negative numbers: e.g., [−3.14] = −4 not −3, and {−3.14} =
.86 not .14.)


Exercise 1.7.5.† (a) Show that if n is an integer, then {n + α} = {α} and [n + α] = n + [α]
for all α ∈ R.

(b) Prove that [α + β] − [α] − [β] = 0 or 1 for all α, β ∈ R, and explain when each case occurs.

(c) Deduce that {α} + {β} − {α + β} = 0 or 1 for all α, β ∈ R, and explain when each case
occurs.

(d) Show that {α} + {−α} = 1 unless α is an integer in which case it equals 0.
(e) Show that if a ∈ Z and r ∈ R \ Z, then [r] + [a − r] = a − 1.


Exercise 1.7.6. Suppose that d is a positive integer and that N, x > 0.
(a) Show that there are exactly [x] positive integers ≤ x.
(b) Show that kd is the largest multiple of d that is ≤ N, where k = [N/d].
(c) Deduce that there are exactly [N/d] positive integers n ≤ N which are divisible by d.


Exercise 1.7.7. Prove that $\sum_{k=0}^{n-1} \left[\alpha + \frac{k}{n}\right] = [n\alpha]$ for any real number α and integer n ≥ 1.


Exercise 1.7.8. Suppose that a+b =c and let g = gcd(a,b). Prove that we can write a = gA, 
b = gB, and c= gC where A+ B =C, where A, B, and C are pairwise coprime integers. 


Exercise 1.7.9. Prove that if (a, b) = 1, then (a + b, a − b) = 1 or 2.


Exercise 1.7.10. Prove that for any given integers b > a ≥ 1 there exists an integer solution
u, w to au − bw = gcd(a, b) with 0 ≤ u ≤ b − 1 and 0 ≤ w ≤ a − 1.




Exercise 1.7.11.† Show that if gcd(a, b) = 1, then $\gcd(a^k, b^\ell) = 1$ for all integers k, ℓ ≥ 1.


Exercise 1.7.12. Let m and n be positive integers. What fractions do the two lists $\frac{1}{m}, \frac{2}{m}, \ldots, \frac{m-1}{m}$
and $\frac{1}{n}, \frac{2}{n}, \ldots, \frac{n-1}{n}$ have in common (when the fractions are reduced)?


Exercise 1.7.13. Suppose m and n are coprime positive integers. When the fractions
$\frac{1}{m}, \frac{2}{m}, \ldots, \frac{m-1}{m}, \frac{1}{n}, \frac{2}{n}, \ldots, \frac{n-1}{n}$
are put in increasing order, what is the shortest distance between two consecutive
fractions?


Given a 7-liter jug and a 5-liter jug one can measure 1 liter of water as follows: 
Fill the 5-liter jug, and pour the contents into the 7-liter jug. Fill the 5-liter jug 
again, use this to fill the 7-liter jug, so we are left with 3 liters in the 5-liter jug 
and the 7-liter jug is full. Empty the 7-liter jug, pour the contents of the 5-liter jug 
into the 7-liter jug, and refill the 5-liter jug. We now have 3 liters in the 7-liter jug. 
Fill the 7-liter jug using the 5-liter jug; we have poured 4 liters from the 5-liter jug 
into the 7-liter jug, so that there is just 1 liter left in the 5-liter jug! Notice that 
we filled the 5-liter jug 3 times and emptied the 7-liter jug twice, and so we used 
here that 3 × 5 − 2 × 7 = 1. We have wasted 2 × 7 liters of water in this process.
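
The pouring procedure just described can be simulated. The sketch below is one possible encoding of it, with hypothetical names (`measure_with_jugs`, `small_cap`, `big_cap`); it assumes the smaller jug is the one being filled, and it counts how often the small jug is filled and how often the large jug becomes full.

```python
def measure_with_jugs(small_cap, big_cap, target=1, max_steps=10_000):
    """Fill the small jug repeatedly, pouring into the big jug and emptying it when full.

    Returns (fills, big_full) once the small jug holds `target` liters, so that
    fills * small_cap - big_full * big_cap == target; returns None if this does
    not happen within max_steps rounds.
    """
    small = big = 0
    fills = 0       # number of times the small jug was filled
    big_full = 0    # number of times the big jug became full
    for _ in range(max_steps):
        if small == target:
            return fills, big_full
        if big == big_cap:      # empty the full big jug
            big = 0
        if small == 0:          # refill the empty small jug
            small = small_cap
            fills += 1
        pour = min(small, big_cap - big)
        small -= pour
        big += pour
        if big == big_cap:
            big_full += 1
    return None
```

With the 5-liter and 7-liter jugs this returns (3, 2), matching 3 × 5 − 2 × 7 = 1.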


Exercise 1.7.14. (a) Since 3 × 7 − 4 × 5 = 1, describe how we can proceed by filling the 7-liter
jug each time rather than filling the 5-liter jug.
(b) Can you measure 1 liter of water using a 25-liter jug and a 17-liter jug?
(c)† Prove that if m and n are positive coprime integers then you can measure one liter of water
using an m liter jug and an n liter jug.
(d) Prove that one can do this wasting less than mn liters of water.


Exercise 1.7.15. Can you weigh 1 lb of tea using scales with 25-lb and 17-lb weights? 


The definition of a set of linear combinations can be extended to an arbitrary 
set of integers (in place of the set {a,b}); that is, 


$$I(a_1, \ldots, a_k) = \{a_1 m_1 + a_2 m_2 + \cdots + a_k m_k : m_1, m_2, \ldots, m_k \in \mathbb{Z}\}.$$


Exercise 1.7.16. Show that $I(a_1, \ldots, a_k) = I(g)$ for any non-zero integers $a_1, \ldots, a_k$, where we
have $g = \gcd(a_1, \ldots, a_k)$.


Exercise 1.7.17.† Deduce that if we are given integers $a_1, a_2, \ldots, a_k$, not all zero, then there
exist integers $m_1, m_2, \ldots, m_k$ such that

$$m_1 a_1 + m_2 a_2 + \cdots + m_k a_k = \gcd(a_1, a_2, \ldots, a_k).$$

We say that the integers $a_1, a_2, \ldots, a_k$ are relatively prime if $\gcd(a_1, a_2, \ldots, a_k) = 1$. We say that
they are pairwise coprime if $\gcd(a_i, a_j) = 1$ whenever $i \neq j$. Note that 6, 10, 15 are relatively
prime, but not pairwise coprime (since each pair of integers has a common factor > 1).

Exercise 1.7.18. Prove that if $g = \gcd(a_1, a_2, \ldots, a_k)$, then $\gcd(a_1/g, a_2/g, \ldots, a_k/g) = 1$.
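Exercise 1.7.19.† (a) Prove that abc = [a, b, c] · gcd(ab, bc, ca).


(b)‡ Prove that if r + s = n, then

$$a_1 \cdots a_n = \mathrm{lcm}\Big[\prod_{i \in I} a_i : I \subset \{1, \ldots, n\},\ |I| = r\Big] \cdot \gcd\Big(\prod_{j \in J} a_j : J \subset \{1, \ldots, n\},\ |J| = s\Big).$$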
iel jes 




Throughout this book we will present more challenging exercises in the final 
part of each chapter. If some of the questions are part of a consistent subject, then 
they will be presented as a separate subsection: 


Divisors in recurrence sequences 


We begin by noting that for any integer d ≥ 1 we have the polynomial identity
(1.7.1) $x^d - y^d = (x - y)(x^{d-1} + x^{d-2}y + \cdots + xy^{d-2} + y^{d-1}).$

Hence if r and s are integers, then r − s divides $r^d - s^d$. (This also follows from
Corollary 2.3.1 in the next chapter.)


Exercise 1.7.20. (a) Prove that if m | n, then $2^m - 1$ divides $2^n - 1$.
(b)‡ Prove that if n = qm + r with 0 ≤ r ≤ m − 1, then there exists an integer Q such that
$2^n - 1 = Q(2^m - 1) + (2^r - 1)$ (and note that $0 \leq 2^r - 1 < 2^m - 1$).
(c)† Use the Euclidean algorithm to show that $\gcd(2^n - 1, 2^m - 1) = 2^{\gcd(n,m)} - 1$.
(d) What is the value of $\gcd(N^a - 1, N^b - 1)$ for arbitrary integer N ≠ −1, 0, or 1?


In exercise 0.4.15(a) we saw that the Mersenne numbers $M_n = 2^n - 1$ (of the
previous exercise) are an example of a second-order linear recurrence sequence. We
will show that an analogous result holds for any second-order linear recurrence
sequence that begins 0, 1, .... For the rest of this section we assume that a and b
are coprime integers with $x_0 = 0$, $x_1 = 1$ and that $x_n = a x_{n-1} + b x_{n-2}$ for all
n ≥ 2.


Exercise 1.7.21. Use exercise 0.4.10(a) to show that $\gcd(x_m, x_n) = \gcd(x_m, x_{m+1} x_{n-m})$ whenever
n ≥ m.


Exercise 1.7.22.† Prove that if m | n, then $x_m \mid x_n$; that is, $\{x_n : n \geq 0\}$ is a division sequence.


Exercise 1.7.23.† Assume that (a, b) = 1.

(a) Prove that $\gcd(x_n, b) = 1$ for all n ≥ 1.

(b) Prove that $\gcd(x_n, x_{n-1}) = 1$ for all n ≥ 1.

(c) Prove that if n > m, then $(x_n, x_m) = (x_{n-m}, x_m)$.
(d) Deduce that $(x_n, x_m) = x_{(n,m)}$.


Exercise 1.7.24.† For any given integer d ≥ 2, let $m = m_d$ be the smallest positive integer for
which d divides $x_m$. Prove that d divides $x_n$ if and only if $m_d$ divides n.


It is sometimes possible to reverse the direction in the defining recurrence relation
for a recurrence sequence; that is, if b = 1, then (0.1.2) can be rewritten as
$x_{n-2} = -a x_{n-1} + x_n$. So if $x_0 = 0$ and $x_1 = 1$, then $x_{-1} = 1$, $x_{-2} = -a$, .... We
now try to understand the terms $x_{-n}$.

Exercise 1.7.25. Let us suppose that $x_n = a x_{n-1} + x_{n-2}$ for all integers n, both positive and
negative, with $x_0 = 0$ and $x_1 = 1$. Prove, by induction on n ≥ 1, that $x_{-n} = (-1)^{n-1} x_n$ for all
n ≥ 2.


Appendix 1A. Reformulating 
the Euclidean algorithm 


In section 1.5 we saw that the Euclidean algorithm may be usefully reformulated
in terms of continued fractions. In this appendix we reformulate the Euclidean 
algorithm in two further ways: firstly, in terms of matrix multiplication, which 
makes many of the calculations easier; and secondly, in terms of a dynamical system, 
which will be useful later when we develop similar ideas in a more general context. 


1.8. Euclid matrices and Euclid’s algorithm 


In discussing the Euclidean algorithm we showed that gcd(85,48) = gcd(48, 37) 
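from noting that 85 − 1·48 = 37. In this we changed our attention from the
pair 85, 48 to the pair 48, 37. Writing this down using matrices, we performed this
change via the map

$$\begin{pmatrix} 85 \\ 48 \end{pmatrix} \to \begin{pmatrix} 48 \\ 37 \end{pmatrix} = \begin{pmatrix} 0 & 1 \\ 1 & -1 \end{pmatrix} \begin{pmatrix} 85 \\ 48 \end{pmatrix}.$$

Next we went from the pair 48, 37 to the pair 37, 11 via the map

$$\begin{pmatrix} 48 \\ 37 \end{pmatrix} \to \begin{pmatrix} 37 \\ 11 \end{pmatrix} = \begin{pmatrix} 0 & 1 \\ 1 & -1 \end{pmatrix} \begin{pmatrix} 48 \\ 37 \end{pmatrix},$$

and then from the pair 37, 11 to the pair 11, 4 via the map

$$\begin{pmatrix} 37 \\ 11 \end{pmatrix} \to \begin{pmatrix} 11 \\ 4 \end{pmatrix} = \begin{pmatrix} 0 & 1 \\ 1 & -3 \end{pmatrix} \begin{pmatrix} 37 \\ 11 \end{pmatrix}.$$

We can compose these maps so that

$$\begin{pmatrix} 85 \\ 48 \end{pmatrix} \to \begin{pmatrix} 48 \\ 37 \end{pmatrix} \to \begin{pmatrix} 37 \\ 11 \end{pmatrix} = \begin{pmatrix} 0 & 1 \\ 1 & -1 \end{pmatrix} \begin{pmatrix} 48 \\ 37 \end{pmatrix} = \begin{pmatrix} 0 & 1 \\ 1 & -1 \end{pmatrix} \begin{pmatrix} 0 & 1 \\ 1 & -1 \end{pmatrix} \begin{pmatrix} 85 \\ 48 \end{pmatrix}$$

and then

$$\begin{pmatrix} 85 \\ 48 \end{pmatrix} \to \begin{pmatrix} 11 \\ 4 \end{pmatrix} = \begin{pmatrix} 0 & 1 \\ 1 & -3 \end{pmatrix} \begin{pmatrix} 37 \\ 11 \end{pmatrix} = \begin{pmatrix} 0 & 1 \\ 1 & -3 \end{pmatrix} \begin{pmatrix} 0 & 1 \\ 1 & -1 \end{pmatrix} \begin{pmatrix} 0 & 1 \\ 1 & -1 \end{pmatrix} \begin{pmatrix} 85 \\ 48 \end{pmatrix}.$$

Continuing on to the end of the Euclidean algorithm, via 11 = 2·4 + 3, 4 = 1·3 + 1,
and 3 = 3·1 + 0, we have

$$\begin{pmatrix} 1 \\ 0 \end{pmatrix} = \begin{pmatrix} 0 & 1 \\ 1 & -3 \end{pmatrix} \begin{pmatrix} 0 & 1 \\ 1 & -1 \end{pmatrix} \begin{pmatrix} 0 & 1 \\ 1 & -2 \end{pmatrix} \begin{pmatrix} 0 & 1 \\ 1 & -3 \end{pmatrix} \begin{pmatrix} 0 & 1 \\ 1 & -1 \end{pmatrix} \begin{pmatrix} 0 & 1 \\ 1 & -1 \end{pmatrix} \begin{pmatrix} 85 \\ 48 \end{pmatrix}.$$

Since $\begin{pmatrix} 0 & 1 \\ 1 & -x \end{pmatrix} \begin{pmatrix} x & 1 \\ 1 & 0 \end{pmatrix} = I$ for any x, we can invert to obtain

$$\begin{pmatrix} 85 \\ 48 \end{pmatrix} = M \begin{pmatrix} 1 \\ 0 \end{pmatrix} \quad\text{where}\quad M = \begin{pmatrix} 1 & 1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} 1 & 1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} 3 & 1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} 2 & 1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} 1 & 1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} 3 & 1 \\ 1 & 0 \end{pmatrix}.$$

Here we used that the inverse of a product of matrices is the product of the inverses
of those matrices, in reverse order. If we write

$$M := \begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix}$$

where α, β, γ, δ are integers (since the set of integer matrices is closed under
multiplication), then

$$\alpha\delta - \beta\gamma = \det M = (-1)^6 = 1,$$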


since M is the product of six matrices, each of determinant —1, and the determinant 
of the product of matrices equals the product of the determinants. Now 


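$$\begin{pmatrix} 85 \\ 48 \end{pmatrix} = M \begin{pmatrix} 1 \\ 0 \end{pmatrix} = \begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix} \begin{pmatrix} 1 \\ 0 \end{pmatrix} = \begin{pmatrix} \alpha \\ \gamma \end{pmatrix},$$

so that α = 85 and γ = 48. This implies that

$$85\delta - 48\beta = 1;$$

that is, the matrix method gives us the solution to (1.2.1) without extra effort.
If we multiply the matrices defining M together in order, we obtain the sequence

$$\begin{pmatrix} 1 & 1 \\ 1 & 0 \end{pmatrix}, \quad \begin{pmatrix} 1 & 1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} 1 & 1 \\ 1 & 0 \end{pmatrix} = \begin{pmatrix} 2 & 1 \\ 1 & 1 \end{pmatrix}, \quad \begin{pmatrix} 2 & 1 \\ 1 & 1 \end{pmatrix} \begin{pmatrix} 3 & 1 \\ 1 & 0 \end{pmatrix} = \begin{pmatrix} 7 & 2 \\ 4 & 1 \end{pmatrix}$$

and then

$$\begin{pmatrix} 16 & 7 \\ 9 & 4 \end{pmatrix}, \quad \begin{pmatrix} 23 & 16 \\ 13 & 9 \end{pmatrix}, \quad \begin{pmatrix} 85 & 23 \\ 48 & 13 \end{pmatrix}.$$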


We notice that the columns give us the numerators and denominators of the convergents
of the continued fraction for 85/48, as discussed in section 1.5.

We can generalize this discussion to formally explain the Euclidean algorithm:


Let $u_0 := a \geq u_1 := b \geq 1$. Given $u_j \geq u_{j+1} \geq 1$:

• Let $a_j = [u_j/u_{j+1}]$, an integer ≥ 1.
• Let $u_{j+2} = u_j - a_j u_{j+1}$ so that $0 \leq u_{j+2} \leq u_{j+1} - 1$.
• If $u_{j+2} = 0$, then $g := \gcd(a, b) = u_{j+1}$, and terminate the algorithm.
• Otherwise, repeat these steps with the new pair $u_{j+1}, u_{j+2}$.


The first two steps work by Lemma 1.1.1, the third by exercise [1.3]. We end up
with the continued fraction

$$a/b = [a_0, a_1, \ldots, a_k]$$

for some k ≥ 0. The convergents $p_j/q_j = [a_0, a_1, \ldots, a_j]$ are most easily calculated
by matrix arithmetic as

$$\begin{pmatrix} p_j & p_{j-1} \\ q_j & q_{j-1} \end{pmatrix} = \begin{pmatrix} a_0 & 1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} a_1 & 1 \\ 1 & 0 \end{pmatrix} \cdots \begin{pmatrix} a_j & 1 \\ 1 & 0 \end{pmatrix}$$

so that $a/g = p_k$ and $b/g = q_k$, where g = gcd(a, b).
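
The matrix description lends itself to direct computation. The sketch below multiplies the matrices with rows (a_j, 1) and (1, 0) in order and records each partial product; the names `mat_mul` and `euclid_matrix_products` are hypothetical, and matrices are represented as nested lists purely for illustration.

```python
def mat_mul(A, B):
    """Product of two 2-by-2 integer matrices given as nested lists."""
    return [
        [A[0][0] * B[0][0] + A[0][1] * B[1][0], A[0][0] * B[0][1] + A[0][1] * B[1][1]],
        [A[1][0] * B[0][0] + A[1][1] * B[1][0], A[1][0] * B[0][1] + A[1][1] * B[1][1]],
    ]


def euclid_matrix_products(quotients):
    """Partial products of the matrices [[a_j, 1], [1, 0]], taken in order."""
    M = [[1, 0], [0, 1]]
    partial = []
    for a_j in quotients:
        M = mat_mul(M, [[a_j, 1], [1, 0]])
        partial.append(M)
    return partial
```

With the quotients [1, 1, 3, 2, 1, 3] of 85/48, the final product has first column (85, 48) and second column (23, 13), and each partial product has determinant ±1.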


Exercise 1.8.1. Prove that this description of the Euclidean algorithm really works. 


Exercise 1.8.2. (a) Show that $p_j q_{j-1} - p_{j-1} q_j = (-1)^{j+1}$ for all j ≥ 0.
(b) Explain how to use the Euclidean algorithm, along with (1.8.1), to determine, for given
positive integers a and b, an integer solution u, v to the equation au + bv = gcd(a, b).


Exercise 1.8.3. With the notation as above, show that $[a_k, \ldots, a_0] = a/c$ for some integer c for
which 0 < c < a and $bc \equiv (-1)^k \pmod{a}$.


Exercise 1.8.4. Prove that for every n ≥ 1 we have

$$\begin{pmatrix} F_{n+1} & F_n \\ F_n & F_{n-1} \end{pmatrix} = \begin{pmatrix} 1 & 1 \\ 1 & 0 \end{pmatrix}^n,$$

where $F_n$ is the nth Fibonacci number.

My favorite open question in this area is Zaremba’s conjecture: He conjectured
that there is an integer B ≥ 1 such that for every integer n ≥ 2 there exists a
fraction m/n, where m is an integer, 1 ≤ m ≤ n − 1, coprime with n, for which
the continued fraction $m/n = [a_0, a_1, \ldots, a_k]$ has each partial quotient $a_j \leq B$.
Calculations suggest one can take B = 5.


1.9. Euclid matrices and ideal transformations 


In section 1.3 we used Euclid’s algorithm to transform the basis of the ideal I(85, 48)
to I(48, 37), and so on, until we showed that it equals I(1, 0) = I(1). The transformation
rested on the identity


85m + 48n = 48m′ + 37n′, where m′ = m + n and n′ = m;


a transformation we can write as

$$(m, n) \to (m', n') = (m, n) \begin{pmatrix} 1 & 1 \\ 1 & 0 \end{pmatrix}.$$

The transformation of linear forms can then be seen by

$$48m' + 37n' = (m', n') \begin{pmatrix} 48 \\ 37 \end{pmatrix} = (m, n) \begin{pmatrix} 1 & 1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} 48 \\ 37 \end{pmatrix} = (m, n) \begin{pmatrix} 85 \\ 48 \end{pmatrix} = 85m + 48n.$$

The inverse map can be found simply by inverting the matrix:

$$(m', n') \to (m, n) = (m', n') \begin{pmatrix} 0 & 1 \\ 1 & -1 \end{pmatrix}.$$

These linear transformations can be composed by multiplying the relevant matrices,
which are the same matrices that arise in the previous section, section 1.8. For
example, after three steps, the change is

$$(m, n) \to (m_3, n_3) = (m, n) \begin{pmatrix} 7 & 2 \\ 4 & 1 \end{pmatrix}$$

so that $11m_3 + 4n_3 = 85m + 48n$.


Exercise 1.9.1. (a) With the notation of section 1.8, establish that $x u_j + y u_{j+1} = ma + nb$
where the variables x and y are obtained from the variables m and n by a linear transformation.

(b) Deduce that $I(u_j, u_{j+1}) = I(a, b)$ for j = 0, ..., k.


1.10. The dynamics of the Euclidean algorithm 


We now explain a dynamical perspective on the Euclidean algorithm, by focusing on 
each individual transformation of the pair of numbers with which we work. In our 
example, we began with the pair of numbers (85,48), subtracted the smaller from 
the larger to get (37,48), and then swapped the order to obtain (48,37). Now we 
begin with the fraction x := 85/48; the first step transforms x → y := x − 1 = 37/48,
and the second transforms y → 1/y = 48/37. The Euclidean algorithm can easily
be broken down into a series of steps of this form:

$$\frac{85}{48} \to \frac{37}{48} \to \frac{48}{37} \to \frac{11}{37} \to \frac{37}{11} \to \frac{26}{11} \to \frac{15}{11} \to \frac{4}{11}$$

$$\to \frac{11}{4} \to \frac{7}{4} \to \frac{3}{4} \to \frac{4}{3} \to \frac{1}{3} \to \frac{3}{1} \to \frac{2}{1} \to \frac{1}{1} \to \frac{0}{1}.$$

It is possible that the map x → x − 1 is repeated several times consecutively (for
example, as we went from 37/11 to 4/11), the number of times corresponding to the
quotient, [x]. On the other hand, the map y → 1/y is not immediately repeated,
since repeating this map sends y back to y, which corresponds to swapping the
order of a pair of numbers twice, sending the pair back to their original order.


These two linear maps correspond to our matrix transformations:

x → x − 1 corresponds to $\begin{pmatrix} 1 & -1 \\ 0 & 1 \end{pmatrix}$, so that $\begin{pmatrix} 37 \\ 48 \end{pmatrix} = \begin{pmatrix} 1 & -1 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} 85 \\ 48 \end{pmatrix}$;

and y → 1/y corresponds to $\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$, so that $\begin{pmatrix} 48 \\ 37 \end{pmatrix} = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} 37 \\ 48 \end{pmatrix}$.


The Euclidean algorithm is therefore a series of transformations of the form x →
x − 1 and y → 1/y and defines a finite sequence of these transformations that
begins with any given positive rational number and ends with 0. One can invert
that sequence of transformations, to transformations of the form x → x + 1 and
y → 1/y, to begin with 0 and to end at any given rational number.
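
This dynamical description can be sketched with exact rational arithmetic. The function name `euclid_dynamics` is hypothetical; the rule applied is to subtract 1 while the current fraction is at least 1 and to invert otherwise, assuming a positive rational starting value.

```python
from fractions import Fraction


def euclid_dynamics(x):
    """Return the orbit of a positive rational x under x -> x - 1 and y -> 1/y, ending at 0."""
    steps = [x]
    while x != 0:
        if x >= 1:
            x = x - 1      # the map x -> x - 1
        else:
            x = 1 / x      # the map y -> 1/y, applied when 0 < y < 1
        steps.append(x)
    return steps
```

Starting from `Fraction(85, 48)` the orbit has 17 terms: 11 subtractions (the sum of the quotients 1 + 1 + 3 + 2 + 1 + 3) and 5 inversions.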


Determinant 1 transformations. Foreshadowing later results, it is more useful 
to develop a variant on the Euclidean algorithm in which the matrices of all of the 
transformations have determinant 1. To begin with, we break each transformation 
down into the two steps: 


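• Beginning with the pair 85, 48 the first step is to subtract 1 times 48 from 85,
and in general we subtract q times b from a. This transformation is therefore
given by

$$\begin{pmatrix} a \\ b \end{pmatrix} \to \begin{pmatrix} a - qb \\ b \end{pmatrix} = \begin{pmatrix} 1 & -q \\ 0 & 1 \end{pmatrix} \begin{pmatrix} a \\ b \end{pmatrix}, \quad\text{and notice that}\quad \begin{pmatrix} 1 & -q \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} 1 & -1 \\ 0 & 1 \end{pmatrix}^q.$$

• The second step swaps the roles of 37 (= 85 − 48) and 48, corresponding to
a matrix of determinant −1. Here we do something unintuitive which is to
change 48 to −48, so that the matrix has determinant 1:

$$\begin{pmatrix} 37 \\ 48 \end{pmatrix} \to \begin{pmatrix} -48 \\ 37 \end{pmatrix} = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} 37 \\ 48 \end{pmatrix}, \quad\text{and more generally}\quad \begin{pmatrix} a \\ b \end{pmatrix} \to \begin{pmatrix} -b \\ a \end{pmatrix} = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} a \\ b \end{pmatrix}.$$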


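One then sees that if g = gcd(a, b) and $a/b = [a_0, \ldots, a_k]$, then

$$\begin{pmatrix} 0 \\ \pm g \end{pmatrix} = \begin{pmatrix} 1 & (-1)^{k+1} a_k \\ 0 & 1 \end{pmatrix} \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix} \cdots \begin{pmatrix} 1 & a_1 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} 1 & -a_0 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} a \\ b \end{pmatrix},$$

where the signs attached to the quotients alternate because each swap negates one entry of the pair.
We write $S := \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}$ and $T := \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$. Taking inverses here we get

$$\begin{pmatrix} a \\ b \end{pmatrix} = S^{a_0} T S^{-a_1} T \cdots T S^{(-1)^k a_k} \begin{pmatrix} 0 \\ \pm g \end{pmatrix}.$$

If a and b are coprime, then this implies that

(1.10.1) $$S^{a_0} T S^{-a_1} T \cdots S^{(-1)^{k-1} a_{k-1}} T S^{(-1)^k a_k} = \pm \begin{pmatrix} c & a \\ d & b \end{pmatrix}$$

for some integers c and d (and since $T^2 = -I$, the sign can be absorbed into the product). The
left-hand side is the product of determinant one
matrices, and so the right-hand side also has determinant one; that is, cb − ad = 1.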
This is therefore an element of SL(2,Z), the subgroup (under multiplication) of 
2-by-2 integer matrices of determinant one; more specifically 


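$$SL(2, \mathbb{Z}) = \left\{ \begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix} : \alpha, \beta, \gamma, \delta \in \mathbb{Z},\ \alpha\delta - \beta\gamma = 1 \right\}.$$


Theorem 1.2. Each matrix in SL(2, Z) can be represented as $S^{e_1} T^{f_1} \cdots S^{e_r} T^{f_r}$
for integers $e_1, f_1, \ldots, e_r, f_r$.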


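Proof. Suppose that we are given $\begin{pmatrix} x & a \\ y & b \end{pmatrix} \in SL(2, \mathbb{Z})$. Taking determinants we
see that bx − ay = 1. Therefore gcd(a, b) = 1, and so above we saw how to construct
an element of SL(2, Z) with the same last column. In Theorem 3.5 we will show
that every other integer solution to bx − ay = 1 is given by x = c − ma, y = d − mb
for some integer m. Therefore

$$\begin{pmatrix} x & a \\ y & b \end{pmatrix} = \begin{pmatrix} c & a \\ d & b \end{pmatrix} \begin{pmatrix} 1 & 0 \\ -m & 1 \end{pmatrix}.$$

One can easily verify that

$$T^{-1} = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}, \quad\text{so that}\quad T^{-1} S T = \begin{pmatrix} 1 & 0 \\ -1 & 1 \end{pmatrix},$$

and therefore

$$\begin{pmatrix} 1 & 0 \\ -m & 1 \end{pmatrix} = (T^{-1} S T)^m = T^{-1} S^m T.$$
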


Combining these last two statements together with (1.10.1) completes the proof of 
the theorem. 
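
The word in S and T attached to a continued fraction can be multiplied out by machine. The sketch below is an illustration under stated assumptions: it takes S and T to be the matrices with rows (1, 1), (0, 1) and (0, 1), (−1, 0) respectively, uses the alternating exponents $S^{a_0} T S^{-a_1} T S^{a_2} \cdots$ (our reading of the construction above), and uses hypothetical function names. In general the last column of the product is ±(a, b), the sign depending on the number of quotients.

```python
def mat_mul(A, B):
    """Product of two 2-by-2 integer matrices given as nested lists."""
    return [
        [A[0][0] * B[0][0] + A[0][1] * B[1][0], A[0][0] * B[0][1] + A[0][1] * B[1][1]],
        [A[1][0] * B[0][0] + A[1][1] * B[1][0], A[1][0] * B[0][1] + A[1][1] * B[1][1]],
    ]


T = [[0, 1], [-1, 0]]


def s_power(e):
    """The matrix S^e, where S = [[1, 1], [0, 1]]."""
    return [[1, e], [0, 1]]


def word_matrix(quotients):
    """Multiply out S^{a_0} T S^{-a_1} T S^{a_2} ... with alternating exponent signs."""
    M = [[1, 0], [0, 1]]
    for j, a_j in enumerate(quotients):
        if j > 0:
            M = mat_mul(M, T)
        M = mat_mul(M, s_power((-1) ** j * a_j))
    return M
```

For the quotients [1, 1, 3, 2, 1, 3] of 85/48 the product has determinant 1 and last column (85, 48); for [1, 1, 3], the quotients of 7/4, the last column is (−7, −4).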


Appendices. The extended version of chapter 1 has the following additional 
appendices: 

Appendix 1B. Computational aspects of the Euclidean algorithm, which dis- 
cusses how to speed up the Euclidean algorithm, how to determine how long it 
takes, and asks what a “fast” algorithm is.

Appendix 1C. Magic squares is a basic introduction to constructing different 
types of magic and Latin squares of arbitrary dimension. 

Appendix 1D. The Frobenius postage stamp problem introduces the question 
of what amounts of postage can be made up of stamps of given costs. 


Appendix 1E. Egyptian fractions discusses what rational numbers are a sum 
of distinct fractions of the form 1/n. 


Chapter 2 


Congruences 


The key step in understanding the Euclidean algorithm, Lemma 1.1.1, shows that
gcd(a, b) equals gcd(r, b), because b divides a − r. Inspired by how useful this
observation is, Gauss developed the theory of when two given integers, like a and
r, differ by a multiple of b:


2.1. Basic congruences
If m, b, and c are integers for which m divides b − c, then we write
b ≡ c (mod m)


and say that b and c are congruent modulo m, where m is the modulus.¹ The
numbers involved should be integers, not fractions, and the modulus can be taken
in absolute value; that is, b ≡ c (mod m) if and only if b ≡ c (mod |m|), by
definition.

For example, −10 ≡ 15 (mod 5), and −7 ≡ 15 (mod 11), but −7 ≢ 15
(mod 3). Note that b ≡ b (mod m) for all integers m and b.


The integers ≡ a (mod m) are precisely those of the form a + km where k is an
integer, that is, a, a + m, a + 2m, ... as well as a − m, a − 2m, a − 3m, .... We call
this set of integers a congruence class or residue class mod m, and any particular
element of the congruence class is a residue.²
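
The definition translates directly into a divisibility test. In the sketch below the names `is_congruent` and `congruence_class_sample` are hypothetical, and the modulus is assumed to be non-zero.

```python
def is_congruent(b, c, m):
    """True when m divides b - c, that is, when b ≡ c (mod m)."""
    if m == 0:
        raise ValueError("modulus must be non-zero")
    return (b - c) % m == 0


def congruence_class_sample(a, m, radius):
    """The elements a + k*m of the class of a (mod m) with -radius <= k <= radius."""
    return [a + k * m for k in range(-radius, radius + 1)]
```

For example, `is_congruent(-7, 15, 11)` is true while `is_congruent(-7, 15, 3)` is false, and replacing m by −m does not change the answer.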


For any given integers a and m > 0, there exists a unique pair of integers q
and r with 0 ≤ r ≤ m − 1, for which a = qm + r, by Lemma 1.1.1. Therefore there
exists a unique integer r ∈ {0, 1, 2, ..., m − 1} for which a ≡ r (mod m). Moreover,
if two integers are congruent mod m, then they leave the same remainder, r, when


1 Gauss proposed the symbol ≡ because of the analogies between equality and congruence, which
we will soon encounter. To avoid ambiguity he made a minor distinction by adding the extra bar.

2 The sequence of numbers a, a + m, a + 2m, ..., in which we add m to the last number in the
sequence to obtain the next one, is an arithmetic progression.

divided by m. We now prove a generalization of these last remarks:


Theorem 2.1. Suppose that m is a positive integer. Exactly one of any m consecutive
integers is ≡ a (mod m).


Two proofs.³ Suppose we have the m consecutive integers x, x + 1, ..., x + m − 1.


Analytic proof: An integer n in the range x ≤ n < x + m is of the form a + km,
for some integer k, if and only if there exists an integer k for which


x ≤ a + km < x + m.


Subtracting a from each term here and dividing through by m, we find that this
holds if and only if

$$\frac{x - a}{m} \leq k < \frac{x - a}{m} + 1.$$

Hence k must be an integer from an interval of length one which has just one
endpoint included in the interval. Such an integer k exists and is unique; it is the
smallest integer that is ≥ (x − a)/m.


Exercise 2.1.1. Prove that for any real number t there is a unique integer in the interval [t, t + 1).


Number-theoretic proof: By Lemma 1.1.1 there exist integers q and r with 0 ≤ r ≤
m − 1, for which a − x = qm + r. Then x ≤ x + r ≤ x + m − 1
and x + r = a − qm ≡ a (mod m), and so x + r is the integer that we are looking
for. We still need to prove that it is unique:

If x + i ≡ a (mod m) and x + j ≡ a (mod m), where 0 ≤ i < j ≤ m − 1,
then i ≡ a − x ≡ j (mod m), so that m divides j − i, which is impossible as
1 ≤ j − i ≤ m − 1.
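
The number-theoretic proof is constructive, and can be sketched in one line of arithmetic. The function name `representative_in_window` is hypothetical; it relies on Python's `%` operator returning a remainder in the range 0 to m − 1 when m is positive.

```python
def representative_in_window(a, x, m):
    """Return the unique n with x <= n <= x + m - 1 and n ≡ a (mod m), for m >= 1."""
    r = (a - x) % m   # the remainder r in a - x = q*m + r, with 0 <= r <= m - 1
    return x + r
```

A brute-force search over the m consecutive integers x, x + 1, ..., x + m − 1 finds the same single value.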


Exercise 2.1.2. Prove that m divides (n — 1)(n — 2)---(n — m) for every integer n and every 
integer m > 1. 


Theorem 2.1 implies that any m consecutive integers yield a complete set of
residues (mod m); that is, every congruence class (mod m) is represented by exactly 
one element of the given set of m integers. For example, every integer has a unique 
residue amongst 


the least non-negative residues (mod m): 0, 1, 2, ...,(m—1), 
as well as amongst 
the least positive residues (mod m): 1, 2, ...,m, 


and also amongst 


the least negative residues (mod m): −(m − 1), −(m − 2), ..., −2, −1, 0.


For example, 2 is the least positive residue of −13 (mod 5), whereas −3 is the least
negative residue; and 5 is its own least positive residue mod 7, whereas −2 is the
least negative residue. Notice that if the residue is not ≡ 0 (mod m), then these
residues occur in pairs, one positive and the other negative, and at least one of each


3Why give two proofs? Throughout this book we will frequently take the opportunity to give more 
than one proof of a key result. The idea is to highlight different aspects of the theory that are, or will 
become, of interest. Here we find both an analytic proof (meaning that we focus on the size or quantity 
of the objects involved) as well as a number-theoretic proof (in which we use their algebraic properties). 
Sometimes the interplay between these two perspectives can take us much further than either one alone.

pair is $\le m/2$ in absolute value. We call this the absolutely least residue (mod m)
(and we select $m/2$, rather than $-m/2$, when $m$ is even).⁴ For example if $m = 5$,
we can pair up the least positive residues and the least negative residues as


$1 \equiv -4 \pmod 5$, $2 \equiv -3 \pmod 5$, $3 \equiv -2 \pmod 5$, $4 \equiv -1 \pmod 5$,


as well as the exceptional $5 \equiv 0 \pmod 5$. Hence the absolutely least residues
(mod 5) are $-2, -1, 0, 1, 2$. Similarly the absolutely least residues (mod 6) are
$-2, -1, 0, 1, 2, 3$. More generally if $m = 2k + 1$ is odd, then the absolutely least
residues (mod $2k+1$) are $-k, \ldots, -1, 0, 1, \ldots, k$; and if $m = 2k$ is even, then the
absolutely least residues (mod $2k$) are $-(k-1), \ldots, -1, 0, 1, \ldots, k$.
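
The four systems of representatives described above can be computed directly. The following Python sketch is only an illustration: the function names are our own, and it relies on the fact that Python's `a % m` already lies in 0, ..., m-1 when m is positive.

```python
def least_nonnegative_residue(a: int, m: int) -> int:
    """Representative of a (mod m) in {0, 1, ..., m-1}."""
    return a % m  # Python's % is non-negative for m > 0


def least_positive_residue(a: int, m: int) -> int:
    """Representative of a (mod m) in {1, 2, ..., m}."""
    r = a % m
    return r if r != 0 else m


def least_negative_residue(a: int, m: int) -> int:
    """Representative of a (mod m) in {-(m-1), ..., -1, 0}."""
    r = a % m
    return r - m if r != 0 else 0


def absolutely_least_residue(a: int, m: int) -> int:
    """Representative of a (mod m) in (-m/2, m/2], preferring m/2 to -m/2."""
    r = a % m
    return r if 2 * r <= m else r - m


print(least_positive_residue(-13, 5))   # 2
print(least_negative_residue(-13, 5))   # -3
print([absolutely_least_residue(a, 6) for a in range(6)])  # [0, 1, 2, 3, -2, -1]
```

The printed values agree with the examples in the text: 2 and -3 for -13 (mod 5), and the set -2, -1, 0, 1, 2, 3 of absolutely least residues (mod 6).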


We defined a complete set of residues to be any set of representatives for the 
residue classes mod m, one for each residue class. A reduced set of residues has 
representatives only for the residue classes that are coprime with m. For example 
{0,1,2,3,4,5} is a complete set of residues (mod 6), whereas {1,5} is a reduced set 
of residues, as 0, 2, and 4 are divisible by 2, and 0 and 3 are divisible by 3 and so 
are excluded. 
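
As a small illustration of the two definitions, here is a Python sketch that lists one complete set and one reduced set of residues. The function names are hypothetical, and coprimality is tested with the standard library's `math.gcd`.

```python
from math import gcd


def complete_residues(m: int) -> list[int]:
    """One representative of every residue class mod m: 0, 1, ..., m-1."""
    return list(range(m))


def reduced_residues(m: int) -> list[int]:
    """Representatives of the residue classes that are coprime with m."""
    return [a for a in range(m) if gcd(a, m) == 1]


print(complete_residues(6))  # [0, 1, 2, 3, 4, 5]
print(reduced_residues(6))   # [1, 5]
```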


Exercise 2.1.3. Suppose that $a_1, \ldots, a_m$ is a complete set of residues mod $m$. Prove that $m$
divides $(n - a_1) \cdots (n - a_m)$ for every integer $n$.
Exercise 2.1.4. (a) Explain how “a number of the form 3n — 1” means the same thing as “a 
number of the form 3n + 2”, using the language of congruences. 
(b) Prove that the set of integers in the congruence class a (mod d) can be partitioned into the 
set of integers in the congruence classes a (mod kd), a+d (mod kd),... and a+(k—1)d 
(mod kd). 


Exercise 2.1.5. Show that if $a \equiv b \pmod m$, then $(a, m) = (b, m)$.
Exercise 2.1.6. Prove that if $a \equiv b \pmod m$, then $a \equiv b \pmod d$ for any divisor $d$ of $m$.
Exercise 2.1.7. Satisfy yourself that addition and multiplication mod $m$ are commutative.⁵


Exercise 2.1.8. Prove that the property of congruence modulo m is an equivalence relation on
the integers. To prove this, one must establish


(i) $a \equiv a \pmod m$;
(ii) $a \equiv b \pmod m$ implies $b \equiv a \pmod m$;
(iii) $a \equiv b \pmod m$ and $b \equiv c \pmod m$ imply $a \equiv c \pmod m$.


The equivalence classes are therefore the congruence classes mod m. 


One consequence of this is that integers that are congruent modulo m have the 
same least residues modulo m, whereas integers that are not congruent modulo m 
have different least residues. 


The main use of congruences is that they simplify arithmetic when we are looking
into questions about remainders. This is because the usual rules for addition, 
subtraction, and multiplication work for congruences. However, division is a little 
more complicated, as we will see in the next section. 


4 This is often called the least residue in absolute value.

5 A mathematical operation is commutative if you get the same result no matter what order you take
the input variables in. Thus, in $\mathbb{C}$, we have $x + y = y + x$ and $xy = yx$. There are common operations
that are not commutative; for example $a - b \ne b - a$ in $\mathbb{C}$, unless $a = b$. Moreover multiplication
in different settings might not be commutative, for example when we multiply 2-by-2 matrices, as we
discovered, in detail, in section 0.12 of appendix 0D.

Lemma 2.1.1. If $a \equiv b \pmod m$ and $c \equiv d \pmod m$, then


$a + c \equiv b + d \pmod m$,
$a - c \equiv b - d \pmod m$,


and $ac \equiv bd \pmod m$.
Proof. By hypothesis there exist integers $u$ and $v$ for which $a - b = um$ and
$c - d = vm$. Therefore
$(a + c) - (b + d) = (a - b) + (c - d) = um + vm = (u + v)m$
so that $a + c \equiv b + d \pmod m$;
$(a - c) - (b - d) = (a - b) - (c - d) = um - vm = (u - v)m$
so that $a - c \equiv b - d \pmod m$; and


$ac - bd = a(c - d) + d(a - b) = a \cdot vm + d \cdot um = (av + du)m$


so that $ac \equiv bd \pmod m$.


These are the rules of modular arithmetic. 


Exercise 2.1.9. Under the hypothesis of Lemma 2.1.1, show that $ka + lc \equiv kb + ld \pmod m$ for
any integers $k$ and $l$.


Exercise 2.1.10. If $p \mid m$ and $m/p \equiv a \pmod q$, then prove that $m \equiv ap \pmod q$.


2.2. The trouble with division 


Although the rules for addition, subtraction, and multiplication work for congru- 
ences as they do for the integers, reals, and most other mathematical objects we 
have encountered, the rule for division is more subtle. In the complex numbers, if 
we are given numbers $a$ and $b \ne 0$, then there exists a unique value of $c$ for which
a = bc (so that c = a/b), and therefore there is no ambiguity in the definition of 
division. We now look at the multiplication tables mod 5 and mod 6 to see whether 
this same property holds for modular arithmetic: 


×  0  1  2  3  4
0  0  0  0  0  0
1  0  1  2  3  4
2  0  2  4  1  3
3  0  3  1  4  2
4  0  4  3  2  1


The multiplication table (mod 5).


Other than in the top row, we see that every congruence class mod 5 appears 
exactly once in each row of the table. For example, in the row corresponding to the 
multiples of 2, mod 5 we have 0, 2, 4, 1, 3, which implies that for each a (mod 5)
there exists a unique value of c (mod 5) for which $a \equiv 2c \pmod 5$; that is, $c \equiv a/2 \pmod 5$.
We read off


$0/2 \equiv 0 \pmod 5$, $1/2 \equiv 3 \pmod 5$, $2/2 \equiv 1 \pmod 5$,
$3/2 \equiv 4 \pmod 5$, and $4/2 \equiv 2 \pmod 5$,


each division leading to a unique value. This is true in each row, so for every non- 
zero value of b (mod 5) and every a (mod 5), there exists a unique multiple of b, 
which equals a mod 5. Therefore division is well- (and uniquely) defined modulo 5. 


However, the multiplication table mod 6 looks rather different. 


×  0  1  2  3  4  5
0  0  0  0  0  0  0
1  0  1  2  3  4  5
2  0  2  4  0  2  4
3  0  3  0  3  0  3
4  0  4  2  0  4  2
5  0  5  4  3  2  1


The multiplication table (mod 6). 


The row corresponding to the multiples of 5, mod 6, is 0, 5, 4, 3, 2, 1, so that 
each b/5 (mod 6) is well-defined. 


However, the row corresponding to the multiples of 2, mod 6, reads 0, 2, 4, 0, 2, 4. 
There is no solution to 1/2 (mod 6). On the other hand, for something as simple 
as 4/2 (mod 6), there are two different solutions: 5 (mod 6) as well as 2 (mod 6). 
Evidently it is more complicated to understand division mod 6 than mod 5. 
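
The observations read off from the two tables can be reproduced by brute force. The sketch below (with hypothetical function names) builds a row of the multiplication table and lists every class c with bc ≡ a (mod m); it simply scans the row and is not meant as an efficient method.

```python
def table_row(b: int, m: int) -> list[int]:
    """Row b of the multiplication table mod m: b*0, b*1, ..., b*(m-1)."""
    return [(b * c) % m for c in range(m)]


def divide_mod(a: int, b: int, m: int) -> list[int]:
    """All classes c (mod m) with b*c ≡ a (mod m), i.e. the candidates for a/b."""
    return [c for c in range(m) if (b * c - a) % m == 0]


print(table_row(2, 5))      # [0, 2, 4, 1, 3]
print(table_row(2, 6))      # [0, 2, 4, 0, 2, 4]
print(divide_mod(1, 2, 5))  # [3]
print(divide_mod(1, 2, 6))  # []
print(divide_mod(4, 2, 6))  # [2, 5]
```

An empty list means the division has no solution, and a list with more than one entry means the quotient is not unique in that modulus.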


We can obtain a hint of what is going on by applying exercise 2.1.4(b), which
implies that the union of the sets of integers in the two arithmetic progressions 5
(mod 6) and 2 (mod 6) gives exactly the integers $\equiv 2 \pmod 3$. So we now have a
unique solution to 4/2 (mod 6), albeit a congruence class belonging to a different
modulus.


Exercise 2.2.1. Determine one congruence class which gives all solutions to 3 divided by 3
(mod 6). (In other words, find a congruence class a (mod m) such that $3x \equiv 3 \pmod 6$ if and
only if $x \equiv a \pmod m$.)


These issues with division arise when we try to solve equations by division: If we
divide each side of $8 \equiv 2 \pmod 6$ by 2, we obtain the incorrect “$4 \equiv 1 \pmod 6$”.
We can correct this by dividing the modulus through by 2 also, so as to obtain
$4 \equiv 1 \pmod 3$. Even this is not the whole story, for if we wish to divide both
sides of $21 \equiv 6 \pmod 5$ through by 3, we cannot also divide the modulus, since 3
does not divide 5. However, in this case one does not need to divide the modulus
through by 3, since $7 \equiv 2 \pmod 5$. So what is the general rule? We shall resolve
all of these issues in Lemma 3.5.1, after we have developed a little more theory.


2.3. Congruences for polynomials


Let $\mathbb{Z}[x]$ denote the set of polynomials with integer coefficients. Using the above
rules for congruences, one gets a very useful result for congruences involving
polynomials:


Corollary 2.3.1. If $f(x) \in \mathbb{Z}[x]$ and $a \equiv b \pmod m$, then $f(a) \equiv f(b) \pmod m$.


Proof. Since $a \equiv b \pmod m$ we have $a^2 \equiv b^2 \pmod m$ by Lemma 2.1.1, and then


Exercise 2.3.1. Prove that $a^k \equiv b^k \pmod m$ for all integers $k \ge 1$, by induction.


Now, writing $f(x) = \sum_{i=0}^{d} f_i x^i$ where each $f_i$ is an integer, we have


$f(a) = \sum_{i=0}^{d} f_i a^i \equiv \sum_{i=0}^{d} f_i b^i = f(b) \pmod m$,


by Lemma 2.1.1.


This result can be extended to polynomials in many variables.

Exercise 2.3.2. Deduce, from Corollary 2.3.1, that if $f(t) \in \mathbb{Z}[t]$ and $r, s \in \mathbb{Z}$, then $r - s$ divides
$f(r) - f(s)$.


Therefore, for any polynomial $f(x) \in \mathbb{Z}[x]$, the sequence $f(0), f(1), f(2), \ldots$
modulo $m$ is periodic of period $m$; that is, the values repeat every $m$th term in the
sequence, repeating indefinitely. More precisely $f(n + m) \equiv f(n) \pmod m$ for all
integers $n$.


Example. If $f(x) = x^3 - 8x + 6$ and $m = 5$, then we get the sequence
$f(0), f(1), \ldots \equiv 1, 4, 3, 4, 3, 1, 4, 3, 4, 3, 1, \ldots \pmod 5$

and the first five terms 1, 4, 3, 4, 3 repeat infinitely often. Moreover we get the same
pattern if we run through the consecutive negative integer values for $x$.

Note that in this example $f(x)$ is never 0 or 2 (mod 5). Thus none of the
equations

$x^3 - 8x + 6 = 0$, $y^3 - 8y + 1 = 0$, and $z^3 - 8z + 4 = 0$

can have solutions in integers $x$, $y$, or $z$.
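
The periodicity in this example is easy to check numerically. The sketch below evaluates a polynomial mod m by Horner's rule; the function name and the coefficient-list convention (constant term first) are our own choices, and the polynomial used is the cubic x^3 - 8x + 6, which reproduces the sequence listed in the example.

```python
def poly_values_mod(coeffs: list[int], m: int, count: int) -> list[int]:
    """Return f(0), f(1), ..., f(count-1) reduced mod m.

    coeffs holds the integer coefficients of f, constant term first.
    """
    def f(x: int) -> int:
        result = 0
        for c in reversed(coeffs):       # Horner's rule, reducing at each step
            result = (result * x + c) % m
        return result

    return [f(x) for x in range(count)]


# f(x) = x^3 - 8x + 6 evaluated mod 5
values = poly_values_mod([6, -8, 0, 1], 5, 11)
print(values)                         # [1, 4, 3, 4, 3, 1, 4, 3, 4, 3, 1]
print(0 in values, 2 in values)       # False False
```

Since the values repeat with period 5, checking x = 0, 1, 2, 3, 4 is enough to see that 0 and 2 never occur.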


Exercise 2.3.3. Let $f(x) \in \mathbb{Z}[x]$. Suppose that $f(r) \not\equiv 0 \pmod m$ for all integers $r$ in the range
$0 \le r \le m-1$. Deduce that there does not exist an integer $n$ for which $f(n) = 0$.


2.4. Tests for divisibility 


There are easy tests for divisibility based on ideas from this chapter. For example,
writing an integer in decimal as⁶


$a + 10b + 100c + \cdots$,


6 More precisely, $\sum_{i=0}^{d} a_i 10^i$ where each $a_i$ is an integer in $\{0, 1, 2, \ldots, 9\}$ and $a_d \ne 0$. Why did
we write the decimal expansion so informally in the text, when surely good mathematics is all about 
precision? While good mathematics is anchored by precision, mathematical writing also requires good 
communication—after all why shouldn’t the reader understand with as little effort as possible?—and so 
we attempt to explain accurately with as little notation as possible.

we employ Corollary 2.3.1 with $f(x) = a + bx + cx^2 + \cdots$ and $m = 9$, so that
$a + 10b + 100c + \cdots = f(10) \equiv f(1) = a + b + c + \cdots \pmod 9$.


Therefore we can test whether the integer $a + 10b + 100c + \cdots$ is divisible by 9 by
testing whether the much smaller integer $a + b + c + \cdots$ is divisible by 9. In other
words, if an integer is written in decimal notation, then it is divisible by 9 if and 
only if the sum of its digits is divisible by 9. This same test works for divisibility by 
3 (by exercise 2.1.6) since 3 divides 9. For example, to decide whether 7361842509 
is divisible by 9, we need only decide whether 7+3+6+1+8+4+2+5+0+9 = 45 
is divisible by 9, and this holds if and only if 4+ 5 = 9 is divisible by 9, which it 
obviously is. 

One can test for divisibility by 11 in a similar way: Since $10 \equiv -1 \pmod{11}$,
we deduce that $f(10) \equiv f(-1) \pmod{11}$ from Corollary 2.3.1, and so

$a + 10b + 100c + \cdots \equiv a - b + c - \cdots \pmod{11}$.

Therefore 7361842509 is divisible by 11 if and only if $7 - 3 + 6 - 1 + 8 - 4 + 2 - 5 + 0 - 9 = 1$
is divisible by 11, which it is not.
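
Both digit tests can be sketched in a few lines of Python. The function names are our own; the alternating sum below starts with a plus sign at the units digit, matching the formula a - b + c - ... above.

```python
def digit_sum(n: int) -> int:
    """Sum of the decimal digits of n; congruent to n mod 9 (and mod 3)."""
    return sum(int(ch) for ch in str(abs(n)))


def alternating_digit_sum(n: int) -> int:
    """a - b + c - ..., where a is the units digit, b the tens digit, and so on."""
    digits = [int(ch) for ch in reversed(str(abs(n)))]
    return sum(d if i % 2 == 0 else -d for i, d in enumerate(digits))


n = 7361842509
print(digit_sum(n), digit_sum(n) % 9 == 0)   # 45 True
print(alternating_digit_sum(n))              # -1
print(alternating_digit_sum(n) % 11 == 0)    # False
```

For this ten-digit number the sum taken from the units digit is -1, whereas the text, reading from the leading digit, obtains 1. The two differ only by a sign because the number of digits is even, so the conclusion about divisibility by 11 is the same.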


One may determine similar (but slightly more complicated) rules to test for 
divisibility by any integer, though we will need to develop our theory of congruences. 
We return to this theme in section 7.7. 


Exercise 2.4.1. (a) Invent tests for divisibility by 2 and 5 (easy). 

(b) Invent tests for divisibility by 7 and 13 (similar to the above). 
(c)† Create one test that tests for divisibility by 7, 11, and 13 simultaneously (assuming that
one knows about the divisibility by 7, 11, and 13 of every non-negative integer up to 1000). 


Additional exercises 


Exercise 2.5.1. Prove that if $a$, $b$, and $c$ are integers and $d = b^2 - 4ac$, then $d \equiv 0$ or $1 \pmod 4$.
Exercise 2.5.2. Prove that if $N = a^2 - b^2$, then either $N$ is odd or $N$ is divisible by 4.


Exercise 2.5.3. (a) Prove that 2 divides n(3n + 101) for every integer n. 
(b) Prove that 3 divides n(2n + 1)(n+ 10) for every integer n. 
(c) Prove that 5 divides n(n + 1)(2n + 1)(3n + 1)(4n + 1) for every integer n. 


Exercise 2.5.4. (a) Prove that, for any given integer $k \ge 1$, exactly $k$ of any $km$ consecutive
integers are $\equiv a \pmod m$.
(b)* Let $I$ be an interval of length $N$. Prove that the number of integers in $I$ that are $\equiv a \pmod m$
is between $N/m - 1$ and $N/m + 1$.
(c) By considering the number of even integers in (0,2) and then in [0, 2], show that (b) cannot 
be improved, in general. 


Exercise 2.5.5. The Universal Product Code (that is, the bar code used to identify items in the
supermarket) has 12 digits, each between 0 and 9, which we denote by $d_1, \ldots, d_{12}$. The first 11
digits identify the product. The 12th is chosen to be the least residue of


$3d_1 - d_2 + 3d_3 - d_4 + \cdots - d_{10} + 3d_{11} \pmod{10}$.
(a) Deduce that $d_1 + 3d_2 + d_3 + \cdots + d_{11} + 3d_{12}$ is divisible by 10.
(b) Deduce that if the scanner does not read all the digits correctly, then either the sum in (a)
will not be divisible by 10 or the scanner has misread at least two digits.
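
The check-digit rule of this exercise can be tried out numerically. The sketch below follows the formula as stated in the exercise, with the twelfth digit equal to the least residue of 3d_1 - d_2 + 3d_3 - ... - d_10 + 3d_11 (mod 10); the function names and the sample digits are made up for illustration, and no claim is made about any real-world bar code specification. It illustrates, but does not prove, parts (a) and (b).

```python
def check_digit(first_eleven: list[int]) -> int:
    """Twelfth digit: least residue of 3d1 - d2 + 3d3 - ... - d10 + 3d11 (mod 10)."""
    if len(first_eleven) != 11:
        raise ValueError("expected exactly 11 digits")
    total = sum(3 * d if i % 2 == 0 else -d for i, d in enumerate(first_eleven))
    return total % 10


def weighted_sum(digits: list[int]) -> int:
    """d1 + 3*d2 + d3 + ... + d11 + 3*d12 for a full 12-digit code."""
    return sum(d if i % 2 == 0 else 3 * d for i, d in enumerate(digits))


code = [0, 3, 6, 0, 0, 0, 2, 9, 1, 4, 5]    # arbitrary sample digits
full = code + [check_digit(code)]
print(full[-1], weighted_sum(full) % 10)    # 6 0
```

Changing any single digit of `full` alters the weighted sum by a non-zero multiple of 1 or 3 that is smaller than 30 in absolute value and not divisible by 10, which is the mechanism behind part (b).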


Exercise 2.5.6. (a) Take $f(x) = x^2$ in Corollary 2.3.1 to determine the squares modulo $m$,
for $m = 3, 4, 5, 6, 7, 8, 9$, and 10. (“The squares modulo $m$” are those congruence classes
(mod $m$) that are equivalent to the square of at least one congruence class (mod $m$).)
(b) Show that there are no solutions in integers $x, y, z$ to $x^2 + y^2 = z^2$ with $x$ and $y$ odd.
(c) Show that if $x^2 + y^2 = z^2$, then 3 divides $xy$.
(d) Show that there are no solutions in integers $x, y, z$ to $x^2 + y^2 = 3z^2$ with $(x, y) = 1$.
(e) Show that there are no solutions in integers $x, y, z$ to $x^2 + y^2 = 666z^2$ with $(x, y) = 1$.
(f) Prove that no integer $\equiv 7 \pmod 8$ can be written as the sum of three squares of integers.


Exercise 2.5.7.† Show that if $x^3 + y^3 = z^3$, then 7 divides $xyz$.


Binomial coefficients modulo p 


We will assume that p is prime for all of the next two sections. 


Exercise 2.5.8. Use the formula for $\binom{p}{j}$ given in (0.3.1) to prove that $p$ divides $\binom{p}{j}$ for all integers
$j$ in the range $1 \le j \le p-1$. This implies that $\frac{1}{p}\binom{p}{j}$ is an integer.


For $1 \le j \le p-1$ we can write $\frac{1}{p}\binom{p}{j} = \frac{(p-1)(p-2)\cdots(p-(j-1))}{j!}$. There is considerable
cancelation when we reduce this latter expression mod $p$.
Exercise 2.5.9. (a) Prove that $\binom{p-1}{j} \equiv (-1)^j \pmod p$ for all $j$, $0 \le j \le p-1$.
(b) Prove that $\frac{1}{p}\binom{p}{j} \equiv (-1)^{j-1}/j \pmod p$ for all $j$, $1 \le j \le p-1$.


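Exercise 2.5.10.† (a) Prove that $\binom{ap}{bp} \equiv \binom{a}{b} \pmod p$ whenever $a, b \ge 0$.
(b) Prove that $\binom{ap+c}{bp+d} \equiv \binom{a}{b} \cdot \binom{c}{d} \pmod p$ whenever $0 \le c, d \le p-1$. (Remember that $\binom{c}{d} = 0$
if $c < d$.)
(c) If $m = m_0 + m_1 p + m_2 p^2 + \cdots + m_k p^k$ and $n = n_0 + n_1 p + \cdots + n_k p^k$ are non-negative
integers written in base $p$, deduce Lucas’s Theorem (by induction on $k \ge 0$), that


$\binom{m}{n} \equiv \binom{m_0}{n_0} \binom{m_1}{n_1} \cdots \binom{m_k}{n_k} \pmod p$.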


One can extend the notion of congruences to polynomials with integer
coefficients: For $f(x), g(x) \in \mathbb{Z}[x]$ we have $f(x) \equiv g(x) \pmod m$ if and only if there
exists a polynomial $h(x) \in \mathbb{Z}[x]$ for which $f(x) - g(x) = m h(x)$. This notion can
be extended even further to polynomials in several variables.


The binomial theorem for $n = 3$ gives
$(x + y)^3 = x^3 + 3x^2 y + 3x y^2 + y^3$.
Notice that the two middle coefficients here are both 3, and so
$(x + y)^3 \equiv x^3 + y^3 \pmod 3$.
Similarly
$(x + y)^5 = x^5 + 5x^4 y + 10x^3 y^2 + 10x^2 y^3 + 5x y^4 + y^5 \equiv x^5 + y^5 \pmod 5$,


since all four of the middle coefficients are divisible by 5. This does not generalize
to all exponents $n$; for example, for $n = 4$ we have $(x + y)^4 \equiv x^4 + 2x^2 y^2 + y^4 \pmod 4$,
which is not congruent to $x^4 + y^4 \pmod 4$. However, the above does generalize
to all prime exponents, as we will see in the next exercise.
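
The pattern in these expansions can be checked for small exponents with the standard library's `math.comb`. This is a numerical illustration only, with a hypothetical function name, and is no substitute for the proof requested in the exercises.

```python
from math import comb


def middle_coefficients_mod(n: int) -> list[int]:
    """Binomial coefficients C(n, 1), ..., C(n, n-1), each reduced mod n."""
    return [comb(n, j) % n for j in range(1, n)]


for n in (3, 4, 5, 7):
    print(n, middle_coefficients_mod(n))
# 3 [0, 0]
# 4 [0, 2, 0]
# 5 [0, 0, 0, 0]
# 7 [0, 0, 0, 0, 0, 0]
```

For the primes 3, 5, and 7 every middle coefficient vanishes mod n, while for n = 4 the coefficient 6 of x^2 y^2 leaves the residue 2.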


Exercise 2.5.11. Deduce from exercise 2.5.8 that $(x + y)^p \equiv x^p + y^p \pmod p$ for all primes $p$.⁷


Exercise 2.5.12. Prove that $(x + y)^{p-1} \equiv x^{p-1} - y x^{p-2} + \cdots - x y^{p-2} + y^{p-1} \pmod p$.


7 This is sometimes called the freshman’s dream or the child’s binomial theorem, sarcastically referring
to the unfortunately common mistaken belief that this works over $\mathbb{C}$, rather than the more complicated
binomial theorem, as in section

Exercise 2.5.13. Prove that $(x + y)^{p^k} \equiv x^{p^k} + y^{p^k} \pmod p$ for all primes $p$ and integers $k \ge 1$.


Exercise 2.5.14. (a) Writing a positive integer $n = n_0 + n_1 p + n_2 p^2 + \cdots$ in base $p$, use
exercise 2.5.13 to prove that

$(x + y)^n \equiv (x + y)^{n_0} (x^p + y^p)^{n_1} (x^{p^2} + y^{p^2})^{n_2} \cdots \pmod p$.


(b)* Reprove Lucas’s Theorem (as in exercise 2.5.10(c)) by studying the coefficient of $x^m y^{n-m}$
in (a).


Exercise 2.5.15. (a) Prove that $(x + y + z)^p \equiv x^p + y^p + z^p \pmod p$.
(b) Deduce that $(x_1 + x_2 + \cdots + x_n)^p \equiv x_1^p + x_2^p + \cdots + x_n^p \pmod p$ for all $n \ge 2$.


The Fibonacci numbers modulo d 


The Fibonacci numbers mod 2 are
$0, 1, 1, 0, 1, 1, 0, 1, 1, \ldots$


We see that the Fibonacci numbers modulo 2 are periodic of period 3. The
Fibonacci numbers mod 3 are


$0, 1, 1, 2, 0, 2, 2, 1, 0, 1, 1, 2, 0, \ldots$


and so seem to be periodic of period 8. In exercise 1.7.24 we defined $m = m_d$ to
be the smallest positive integer for which $d$ divides $F_m$, and showed that $d$ divides
$F_n$ if and only if $m_d$ divides $n$. In our two cases we therefore have $m_2 = 3$ which is
the period and $m_3 = 4$ which is half the period.
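
These small cases can be reproduced by iterating the pair (F_n, F_{n+1}) mod d until the starting pair (0, 1) returns. The following sketch uses hypothetical function names; it assumes, as is justified by the fact that the step (F_n, F_{n+1}) to (F_{n+1}, F_{n+2}) can be reversed mod d, that the starting pair does recur.

```python
def fibonacci_period(d: int) -> int:
    """Number of steps until the pair (F_n, F_{n+1}) mod d returns to its start."""
    start = (0, 1 % d)
    a, b = start
    n = 0
    while True:
        a, b = b, (a + b) % d
        n += 1
        if (a, b) == start:
            return n


def smallest_index_divisible_by(d: int) -> int:
    """Smallest m >= 1 for which d divides F_m (the m_d of the text)."""
    a, b = 0, 1 % d
    m = 0
    while True:
        a, b = b, (a + b) % d
        m += 1
        if a == 0:
            return m


print(fibonacci_period(2), smallest_index_divisible_by(2))  # 3 3
print(fibonacci_period(3), smallest_index_divisible_by(3))  # 8 4
```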


In the next exercise we show that Fibonacci numbers (and other such sequences) 
are periodic mod d, for every integer d > 1, by using the pigeonhole principle. This 
states that if one puts N + 1 letters into N pigeonholes, then, no matter how one 
does this, some pigeonhole will contain at least two letters.⁸


Exercise 2.5.16. (a) Prove that the pigeonhole principle is true.
We will now show that the Mersenne numbers $M_n := 2^n - 1$ are periodic mod $d$.
(b) Show that there exist two integers in the range $0 \le r < s \le d$ for which $M_r \equiv M_s \pmod d$.
(c) In exercise 0.4.15(b) we saw that the Mersenne numbers satisfy the recurrence $M_{n+1} = 2M_n + 1$.
Use this to show that $M_{r+j} \equiv M_{s+j} \pmod d$ for all $j \ge 0$.
(d) Deduce that there exists a positive integer $p = p_d$, which is $\le d$, such that $M_{n+p} \equiv M_n \pmod d$
for all $n \ge d$. That is, $M_n$ is eventually periodic mod $d$ with period $p_d \le d$.


An analogous proof works for general second-order linear recurrence sequences,
including Fibonacci numbers. For the rest of this section, we suppose $a$ and $b$ are
integers and $\{x_n : n \ge 0\}$ is the second-order linear recurrence sequence given by


$x_n = a x_{n-1} + b x_{n-2}$ for all $n \ge 2$ with $x_0 = 0$ and $x_1 = 1$.


Exercise 2.5.17. (a) By using the pigeonhole principle creatively, prove that there exist two
integers in the range $0 \le r < s \le d^2$ for which $x_r \equiv x_s \pmod d$ and $x_{r+1} \equiv x_{s+1} \pmod d$.
(b) Use the recurrence for the $x_n$ to show that $x_{r+j} \equiv x_{s+j} \pmod d$ for all $j \ge 0$.
(c) Deduce that the $x_n$ are eventually periodic mod $d$ with period $\le d^2$.
(d) Prove that $m_d$ divides the period mod $d$.
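
The pigeonhole idea behind this exercise can be watched in action: there are only d^2 possible pairs (x_n, x_{n+1}) mod d, so some pair must repeat, and from then on the sequence cycles. The sketch below records the first index at which each pair occurs; the function name and return convention are our own, and the code is an experiment rather than a solution to the exercise.

```python
def eventual_period(a: int, b: int, d: int) -> tuple[int, int]:
    """Return (r, p): from index r on, the pairs (x_n, x_{n+1}) mod d repeat with period p.

    Here x_0 = 0, x_1 = 1 and x_n = a*x_{n-1} + b*x_{n-2}.
    """
    seen: dict[tuple[int, int], int] = {}   # pair -> first index where it occurred
    state = (0 % d, 1 % d)
    n = 0
    while state not in seen:                # at most d*d distinct pairs exist
        seen[state] = n
        x, y = state
        state = (y, (a * y + b * x) % d)
        n += 1
    r = seen[state]
    return r, n - r


print(eventual_period(1, 1, 3))   # (0, 8)   Fibonacci numbers mod 3
print(eventual_period(2, 2, 4))   # (4, 1)   becomes constant after four terms
```

The second example shows why the word "eventually" is needed: with a = b = 2 and d = 4 the sequence begins 0, 1, 2, 2, 0, 0, ... and never returns to its starting pair.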


8 In French, this is the “principle of the drawers”. What invocative metaphors are used to describe
this principle in other languages?

We saw above that the Fibonacci numbers mod 3 have period $3^2 - 1$, and further
calculations reveal that the period mod $d$ never seems to be larger than $d^2 - 1$, a
small improvement over the bound that we obtained in exercise 2.5.17(c). In the
next exercise we see how to obtain this bound, in general.


Exercise 2.5.18. (a) Show that if there exists a positive integer $r$ for which $x_r \equiv x_{r+1} \equiv 0 \pmod d$,
then $x_n \equiv 0 \pmod d$ for all $n \ge r$ so that the $x_n$ are eventually periodic mod $d$
with period 1.
(b) Now assume that there does not exist a positive integer $r$ for which $x_r \equiv x_{r+1} \equiv 0 \pmod d$.
Modify the proof of exercise 2.5.17 to prove that the $x_n$ are eventually periodic mod $d$ with
period $\le d^2 - 1$.


It is possible to get a more precise understanding of the Fibonacci numbers and 
other second-order recurrences, mod d: 


Exercise 2.5.19. In order to understand $x_n$ (mod $d$), we take $m = m_d$ in the results of this
exercise.

(a) Prove, by induction, that $x_{m+k} \equiv x_{m+1} x_k \pmod{x_m}$ for all $k \ge 0$.

(b) Deduce the same result from exercise 0.4.10.

(c) Deduce that if $n = qm + r$ with $0 \le r \le m-1$, then $x_n \equiv (x_{m+1})^q x_r \pmod{x_m}$.


We will return to this result in chapter 7 where we study the powers mod n. 


In exercise 0.1.5 we saw the importance of the discriminant⁹ $\Delta := a^2 + 4b$ of
the quadratic polynomial $x^2 - ax - b$. The rule for the $x_n$ mod $\Delta$ is a little easier:


Exercise 2.5.20. Prove by induction that
(a) $x_{2k} \equiv ka(-b)^{k-1} \pmod \Delta$ and $x_{2k+1} \equiv (2k+1)(-b)^k \pmod \Delta$ for all $k \ge 0$ and
(b) $x_{2k} \equiv kab^{k-1} \pmod{a^2}$ and $x_{2k+1} \equiv b^k \pmod{a^2}$ for all $k \ge 0$.


Exercise 2.5.21. Suppose that the sequence $(u_n)_{n \ge 1}$ satisfies a $d$th-order linear recurrence (as
defined in appendix 0B). Prove that for any integer $m > 1$, the $u_n$ are eventually periodic mod
$m$ with period $\le m^d - 1$. (We prove that this bound is best possible when $m$ is prime in exercise
7.25.5.)


9 The colon “:” plays many roles in the grammar of mathematics. Here it means that “Henceforth
we define $\Delta$ to be....”


Appendix 2A. Congruences 
in the language of groups 


2.6. Further discussion of the basic notion of congruence 


Congruences can be rephrased in the language of groups. The integers, $\mathbb{Z}$, form a
group¹⁰ in which addition is the group operation. In exercise 0.11.1 of appendix
0D we proved that the non-trivial, proper subgroups of $\mathbb{Z}$ all take the form
$m\mathbb{Z} := \{mn : n \in \mathbb{Z}\}$ for some integer $m > 1$, that is, the set of integers divisible by $m$.
The congruence classes (mod $m$) are simply the cosets of $m\mathbb{Z}$ inside $\mathbb{Z}$:


$0 + m\mathbb{Z}$, $1 + m\mathbb{Z}$, $2 + m\mathbb{Z}$, ..., $(m-1) + m\mathbb{Z}$,


where

$j + m\mathbb{Z} := \{j + mn : n \in \mathbb{Z}\}$,
which is the set of integers belonging to the congruence class $j$ (mod $m$). Notice
that the $m$ cosets of $m\mathbb{Z}$ are disjoint and their union gives all of $\mathbb{Z}$.


The group operation on Z, namely addition, is inherited by the cosets of mZ. 
For example, as 7+ 11 = 18 in Z, the same is true when we add together the 
relevant cosets of mZ in Z; in other words 


$(7 + m\mathbb{Z}) + (11 + m\mathbb{Z}) = (18 + m\mathbb{Z})$.
This new additive group is the quotient group 
Z/mZ. 


This is the beginning of the theory of quotient groups, which we develop in the 
next section. 
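
To see concretely that the sum of two cosets does not depend on which representatives are chosen, one can take several elements from each coset and add them. Since a coset is an infinite set, the sketch below works with a finite window of it; the function names and the `span` parameter are our own devices for illustration.

```python
def coset(j: int, m: int, span: int = 3) -> set[int]:
    """A finite window of the coset j + mZ: the elements j + m*n with |n| <= span."""
    return {j + m * n for n in range(-span, span + 1)}


def class_of(a: int, m: int) -> int:
    """Least non-negative representative of the coset a + mZ."""
    return a % m


m = 5
sums = {class_of(a + b, m) for a in coset(7, m) for b in coset(11, m)}
print(sums, class_of(18, m))   # {3} 3
```

Every choice of representatives from 7 + 5Z and 11 + 5Z gives a sum lying in the single coset 18 + 5Z.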


10 See appendix 0D for a discussion of the basic properties of groups.

11 Throughout, we define the sum of two given sets $A$ and $B$ to be $A + B := \{a + b : a \in A, b \in B\}$,
that is, the set of elements that can be represented as $a + b$ with $a \in A$ and $b \in B$. Note that an element
may be represented more than once.

The reader should be aware that multiplication mod m (and, in particular,
how its properties are inherited from Z) does not fit into this discussion of additive 
quotient groups. 


2.7. Cosets of an additive group 


Suppose that $H$ is a subgroup of an additive (and so abelian¹²) group $G$. A coset
of $H$ in $G$ is given by the set


$a + H := \{a + h : h \in H\}$.


In Proposition 2.7.1 we will show, as in the example $m\mathbb{Z}$ of the previous section,
that the cosets of $H$ are all disjoint and their union gives $G$.


The quotient group $G/H$ has as its elements the distinct cosets $a + H$ and
inherits its group law from $G$, in this case addition, so that


$(a + H) + (b + H) = (a + b) + H$.


Proposition 2.7.1. Let $H$ be a subgroup of an additive group $G$. The cosets
of $H$ in $G$ are disjoint, so that the elements of $G/H$ are well-defined; and the
addition law on $G/H$ is also well-defined. If $G$ is finite, then $|H|$ divides $|G|$ and
$|G/H| = |G|/|H|$.


Proof. If $a + H$ and $b + H$ have a common element $c$, then there exist $h_1, h_2 \in H$
such that $a + h_1 = c = b + h_2$. Therefore $b = a + h_1 - h_2 = a + h_0$ where
$h_0 = h_1 - h_2 \in H$ since $H$ is a group (and therefore closed under addition). Now if
$h \in H$, then $b + h = a + (h_0 + h) \in a + H$, as $h_0 + h \in H$, so that $b + H \subset a + H$, and
by the analogous argument $a + H \subset b + H$. We deduce that $a + H = b + H$. Hence
the cosets of $H$ are either identical or disjoint, which means that they partition $G$.
Each coset contains exactly $|H|$ elements; therefore if $G$ is finite, then $|H|$ divides $|G|$.


This also implies that if $c \in a + H$, then $c + H = a + H$. We wish to show that
addition in $G/H$ is well-defined. If $a + H$, $b + H$ are cosets of $H$, then we defined
$(a + H) + (b + H) = (a + b) + H$, so we need to verify that the sum of the two cosets
does not depend on the choice of representatives of the cosets. So, if $c \in a + H$ and
$d \in b + H$, then there exist $h_1, h_2 \in H$ for which $c = a + h_1$ and $d = b + h_2$. Then
$c + H = a + H$ and $d + H = b + H$. Moreover $c + d = a + b + (h_1 + h_2) \in a + b + H$,
as $H$ is closed under addition, and so $c + d + H = a + b + H$, as desired. Hence
$G/H$ is well-defined, and $|G/H| = |G|/|H|$ when $G$ is finite.


Example. Z is a subgroup of the additive group R, and the cosets a+ Z are given 
by all real numbers r that differ from a by an integer. Every coset a+Z has exactly 
one representative in any given interval of length 1, in particular the interval [0, 1) 
where the coset representative is {a}, the fractional part of a. These cosets are 
well-defined under addition and yield the quotient group R/Z. 

The exponential map $e : \mathbb{R} \to U := \{z \in \mathbb{C} : |z| = 1\}$, from the real numbers to
the unit circle, is defined by $e(t) = e^{2\pi i t}$. Since $e(1) = 1$, therefore $e(n) = e(1)^n = 1$
for every integer $n$. Therefore if $b \in a + \mathbb{Z}$ so that $b = a + n$ for some integer $n$,
then $e(b) = e(a + n) = e(a)e(n) = e(a)$, so the value of $e(t)$ depends only on what


12 A group $G$ is called abelian or commutative if $ab = ba$ for all elements $a, b \in G$.

coset $t$ belongs to in $\mathbb{R}/\mathbb{Z}$. Therefore we can think of the exponential map as the
concatenation of two maps: firstly the natural quotient map from $\mathbb{R} \to \mathbb{R}/\mathbb{Z}$ (that
is, $a \mapsto a + \mathbb{Z}$) and then the map $e : \mathbb{R}/\mathbb{Z} \to U$. Picking the representatives $[0, 1)$
for $\mathbb{R}/\mathbb{Z}$, we see that the restricted map $e : [0, 1) \to U$ is 1-to-1.


By a slight abuse of terminology, we let $a \equiv b \pmod 1$, for real numbers $a$ and
$b$, if and only if $a$ and $b$ belong to the same coset of $\mathbb{R}/\mathbb{Z}$.


Exercise 2.7.1. Prove that $a \equiv b \pmod m$ if and only if $a/m$ and $b/m$ belong to the same coset
of $\mathbb{R}/\mathbb{Z}$.


Exercise 2.7.2. (a) Prove that $t \equiv \{t\} \pmod 1$ for all real numbers $t$.
(b) Prove that the usual rules of addition, subtraction, and multiplication hold mod 1. 
(c) Show that division is not always well-defined mod 1, by finding a counterexample. 


2.8. A new family of rings and fields 


We have seen, in Lemma 2.1.1, that the congruence classes mod $m$ support both
an additive and multiplicative structure.


Exercise 2.8.1. Prove that $\mathbb{Z}/m\mathbb{Z}$ is a ring for all integers $m \ge 2$.


To be a field, all the non-zero congruence classes of $\mathbb{Z}/m\mathbb{Z}$ would need to have a
multiplicative inverse, but this is not the case for all $m$. For example we claim that
3 does not have a multiplicative inverse mod 15. If it did, say $3m \equiv 1 \pmod{15}$,
then multiplying through by 5 we obtain $5 \equiv 5 \cdot 1 \equiv 5 \cdot 3m \equiv 0 \pmod{15}$, which is
evidently untrue.


We call 3 and 5 zero divisors since they non-trivially divide 0 in Z/15Z. 


Exercise 2.8.2. (a) Prove that if m is a composite integer > 1, then Z/mZ has zero divisors. 
(b) Prove that Z/mZ is not a field whenever m is a composite integer > 1. 
(c) Prove that if R is any ring with zero divisors, then R cannot be a field. 


An integral domain is a ring with no zero divisors. Note that Z is an integral 
domain (hence the name) but is not a field. 


If $R$ is a commutative ring and $m \in R$, then $mR$ is an additive subgroup of
$R$, and the cosets of $mR$ support a multiplicative structure. To see this, note that
if $x \in a + mR$ and $y \in b + mR$, then $x = a + mr_1$ and $y = b + mr_2$ for some
$r_1, r_2 \in R$, and so $xy = ab + mr$ where $r = ar_2 + br_1 + mr_1 r_2$ which belongs to
$R$, as $R$ is closed under both addition and multiplication. That is, $xy \in ab + mR$.
Hence $R/mR$ inherits the multiplicative and distributive properties of $R$, as well
as the identity element $1 + mR$; and so $R/mR$ is itself a commutative ring.


2.9. The order of an element 


If $g$ is an element of a given group $G$, we define the order of $g$ to be the smallest
integer $n \ge 1$ for which $g^n = 1$, where 1 is the identity element of $G$. If $n$ does not
exist, then we say that $g$ has infinite order (for example, 1 in the additive group $\mathbb{Z}$).
We shall explore the multiplicative order of a reduced residue mod m, in detail, in 
chapter 7. 


There is a beautiful observation of Lagrange which restricts the possible order
of an element in any finite abelian group.

Theorem 2.2 (Lagrange). If $G$ is a finite abelian group, then the order of any
element $g$ of $G$ divides $|G|$, the number of elements in $G$. Moreover, $g^{|G|} = 1$.


Proof. Suppose that $g$ has order $n$ and let $H := \{1, g, g^2, \ldots, g^{n-1}\}$, a subgroup of
$G$ of order $n$. By Proposition 2.7.1 we deduce that $n = |H|$ divides $|G|$. Moreover
if $|G| = mn$, then $g^{|G|} = g^{mn} = (g^n)^m = 1^m = 1$.


Lagrange’s Theorem actually holds for any finite group, non-abelian as well as
abelian, as we will see in Corollary 7.23.1 of appendix 7D.


Appendices. The extended version of chapter 2 has the following additional 
appendix: 
Appendix 2B. The Euclidean algorithm for polynomials, which shows that there is 
an analogous theory for polynomials. 


Chapter 3 


The basic algebra 
of number theory 


A prime number is an integer $n > 1$ whose only positive divisors are 1 and $n$. Hence
2, 3, 5, 7, 11, ... are primes. An integer $n > 1$ is composite if it is not prime.¹


Exercise 3.0.1. Suppose that p is a prime number. Prove that gcd(p, a) = 1 if and only if p does 
not divide a. 


3.1. The Fundamental Theorem of Arithmetic 


Positive integers factor into primes, the basic building blocks out of which integers 
are made. Often, in school, one discovers this by factoring a given composite integer 
into two parts and then factoring each of those parts that are composite into two 
further parts, etc. For example $120 = 8 \times 15$, and then $8 = 2 \times 4$ and $15 = 3 \times 5$.
Now 2, 3, and 5 are all primes, but $4 = 2 \times 2$ is not. Putting this all together gives
$120 = 2 \times 2 \times 2 \times 3 \times 5$. This can be factored no further since 2, 3, and 5 are all
primes. It is not difficult to prove that this always works: 


Exercise 3.1.1. Prove that any integer n > 1 can be factored into a product of primes. 


We can factor 120 in other ways. For example $120 = 4 \times 30$, and then $4 = 2 \times 2$
and $30 = 5 \times 6$. Finally noting that $6 = 2 \times 3$, we eventually obtain the same
factorization, $120 = 2 \times 2 \times 2 \times 3 \times 5$, of 120 into primes, even though we arrived
at it in a different way. No matter how you go about splitting a positive integer up
into its factors, you will always end up with the same factorization into primes.² If
it is true that any two such factorizations are indeed the same and if we are given
one factorization of $n$ as $q_1 \cdots q_k$, then every prime factor $p$ of $n$, found in any other
way, must equal some $q_i$. This suggests that we will need to prove Theorem 3.1.


1 Notice that 1 is neither prime nor composite, and the same is true of 0 and all negative integers.

2 Recognizing that this claim needs a proof, and then supplying a proof, is one of the great
achievements of Greek mathematics. They developed an approach to mathematics which assures that theorems
are established on a solid basis.

Theorem 3.1. If prime $p$ divides $ab$, then $p$ must divide at least one of $a$ and $b$.


We will prove this in the next subsection. The necessity of such a result was
appreciated by ancient Greek mathematicians, who went on to show that Theorem
3.1 is sufficient to establish that every integer has a unique factorization, as we will
see. It is best to begin by making a simple deduction from Theorem 3.1.

Exercise 3.1.2. (a) Prove that if prime $p$ divides $a_1 a_2 \cdots a_k$, then $p$ divides $a_j$ for some $j$, $1 \le j \le k$.

(b) Deduce that if prime $p$ divides $q_1 \cdots q_k$ where each $q_i$ is prime, then $p = q_j$ for some
$j$, $1 \le j \le k$.


With this preparation we are ready to prove the first great theorem of number
theory, which appears in Euclid’s “Elements”.³


Theorem 3.2 (The Fundamental Theorem of Arithmetic). Every integer n > 1 
can be written as a product of primes in a unique way (up to reordering). 


Proof. We first show that there is a factorization of n into primes and afterwards 
we will prove that it is unique. We prove this by induction on n: If n is prime, then 
we are done; since 2 and 3 are primes, this also starts our induction hypothesis. If 
$n$ is composite, then it must have a divisor $a$ for which $1 < a < n$, and so $b = n/a$
is also an integer for which $1 < b < n$. Then, by the induction hypothesis, both
$a$ and $b$ can be factored into primes, and so $n = ab$ equals the product of these
two factorizations. (For example, to prove the result for 1050, we note that $1050 = 15 \times 70$.
We have already obtained the factorizations of 15 and 70, namely $15 = 3 \times 5$
and $70 = 2 \times 5 \times 7$, so that $1050 = 15 \times 70 = (3 \times 5) \times (2 \times 5 \times 7) = 2 \times 3 \times 5 \times 5 \times 7$.)


Now we prove that there is just one factorization for each n ≥ 2. If this is not
true, then let n be the smallest integer ≥ 2 that has two distinct factorizations,

$$p_1 p_2 \cdots p_r = q_1 q_2 \cdots q_s,$$

where the $p_i$ and $q_j$ are (not necessarily distinct) primes. Now prime $p_r$ divides
$q_1 q_2 \cdots q_s$, and so $p_r = q_j$ for some j, by exercise 3.1.2(b). Reordering the $q_j$ if
necessary we may assume that j = s, and if we divide through both factorizations
by $p_r = q_s$, then we have two distinct factorizations of

$$n/p_r = p_1 p_2 \cdots p_{r-1} = q_1 q_2 \cdots q_{s-1}.$$

This contradicts the minimality of n unless $n/p_r = 1$. But then $n = p_r$ is prime,
and by the definition (of primes) it can have no other factorization.


The Fundamental Theorem of Arithmetic states that there is a unique way to 
break down an integer into its fundamental (i.e., irreducible) parts, and so every 
integer can be viewed simply in terms of these parts (i.e., its prime factors). On 
the other hand any finite product of primes equals an integer, so there is a 1-to-1 
correspondence between positive integers and finite products of primes, allowing one 
to translate questions about integers into questions about primes and vice versa. 


³ When we write that a product of primes is “unique up to reordering” we mean that although one
can write 12 as 2 × 2 × 3 or 2 × 3 × 2 or 3 × 2 × 2, we think of all of these as the same product, since
they involve the same primes, each the same number of times, differing only in the way that we order
the prime factors.

It is useful to write the factorizations of natural numbers n in a standard form,
like

$$n = 2^{n_2} 3^{n_3} 5^{n_5} 7^{n_7} \cdots$$

where $n_p$ denotes the exact number of times the prime p divides n. Since n is an
integer, each $n_p \ge 0$, and only finitely many of the $n_p$ are non-zero. Usually we
write down only those prime powers where $n_p \ge 1$, for example $12 = 2^2 \cdot 3$ and
$50 = 2 \cdot 5^2$. We will write $p^e \| n$ if $p^e$ is the highest power of p that divides n; thus
$5^2 \| 50$ and $11^1 \| 1001$.
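The standard form above can be computed for small n by trial division. The following is a minimal sketch, not an efficient factoring method; the function name `factorize` is a hypothetical helper introduced here for illustration, and it returns the exponents $n_p$ as a dictionary keyed by the prime p.

```python
def factorize(n):
    """Return {p: n_p}: the exact power n_p of each prime p dividing n (n >= 1).

    Trial division, as a sketch only: it is far too slow for large n.
    """
    if n < 1:
        raise ValueError("n must be a positive integer")
    exponents = {}
    p = 2
    while p * p <= n:
        # Strip out every factor of p; each pass raises the exponent n_p by one.
        while n % p == 0:
            exponents[p] = exponents.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        # What remains has no divisor up to its square root, so it is prime.
        exponents[n] = exponents.get(n, 0) + 1
    return exponents


print(factorize(12))    # exponents of 12 = 2^2 * 3
print(factorize(50))    # exponents of 50 = 2 * 5^2
print(factorize(1001))  # exponents of 1001 = 7 * 11 * 13
```

Primes that do not divide n are simply absent from the dictionary, which matches the convention of writing down only the prime powers with $n_p \ge 1$.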


Our proof of the Fundamental Theorem of Arithmetic is constructive but it does
not provide an efficient way to find the prime factors of a given integer n. Indeed
finding efficient techniques for factoring an integer is a difficult and important
problem, which we discuss in chapter 10.⁴

In particular, the known difficulty of factoring large integers underlies the security
of the RSA cryptosystem, which is discussed in section 10.3.


Exercise 3.1.3. (a) Prove that every natural number has a unique representation as $2^k m$ with
k ≥ 0 and m an odd natural number.
(b) Show that each integer n ≥ 3 is either divisible by 4 or has at least one odd prime factor.
(c) An integer is squarefree if every prime in its factorization appears to the power 1. Prove that
every non-zero integer can be written, uniquely, in the form $mn^2$ where m is a squarefree
integer and n is a positive integer.
(d)† Deduce that every non-zero rational number can be written, uniquely, in the form $mr^2$
where m is a squarefree integer and r is a positive rational number.

Exercise 3.1.4. (a) Show that if all of the prime factors of an integer n are ≡ 1 (mod m),
then n ≡ 1 (mod m). Deduce that if n ≢ 1 (mod m) then n has a prime factor that is ≢ 1
(mod m).
(b)† Show that if all of the prime factors of an integer n are ≡ 1 or 3 (mod 8), then n ≡ 1 or 3
(mod 8). Prove this with 3 replaced by 5 or 7.
(c)† Generalize this as much as you can to other moduli and other sets of congruence classes.


3.2. Abstractions 


The ancient Greek mathematicians recognized that abstract lemmas allowed them 
to prove sophisticated theorems. For example, in the previous section we stated 
Theorem a result whose formulation is not obviously relevant and yet was used 
to good effect. The archetypal lemma is known today as “Euclid’s Lemma”, an 
important result that first appeared in Euclid’s “Elements” (Book VII, No. 32), 
and we will see that it is even more useful than Theorem 3.1.


Theorem 3.3 (Euclid’s Lemma). If c divides ab and gcd(c, a) = 1, then c must
divide b.


⁴ It is easy enough to multiply together two given integers. If the integers each have 50 digits,
then one can obtain the product in about 3,000 steps (digit-by-digit multiplications) and this can be 
accomplished within a second on a computer. On the other hand, given the 100-digit product, how do 
we factor it to find the original two 50-digit integers? Trial division is too slow ... if every atom in 
the universe were a computer as powerful as any supercomputer, then most such products would not be 
factored before the end of the universe! This is why we need more sophisticated factoring methods, and 
although the best ones known today, implemented on the best computers, can factor a 100-digit number 
in reasonable time, they are currently incapable of factoring typical 200-digit numbers. (See sections
10.4 and 10.6 for further discussion on this theme.)

Proof of Euclid’s Lemma. Since gcd(c, a) = 1 there exist integers m and n such
that cm + an = 1 by Theorem 1.1. Now c divides both c and ab, so that

c divides c·bm + ab·n = b(cm + an) = b,

by exercise 1.1.1(c).


This proof surprisingly uses, implicitly, the complicated construction from Euclid’s
algorithm. Now that we have proved Euclid’s Lemma we proceed to


Deduction of Theorem Suppose that prime p does not divide a (or else we 
are done), and so gcd(p, a) = 1 (as seen in exercise 3.0.1). Taking c = p in Euclid’s
Lemma, we deduce that p divides b. 


The hypothesis “gcd(c, a) = 1” in Euclid’s Lemma is necessary, as may be seen 
from the example in which 4 divides 2·6, but 4 does not divide either 2 or 6.


Now that we have completed the proof of the Fundamental Theorem of Arithmetic,
we are ready to develop the basic number-theoretic properties of integers.⁵
We begin by noting one further important consequence of Euclid’s Lemma:


Corollary 3.2.1. If am = bn, then a/ gcd(a,b) divides n. 


Proof. Let a/gcd(a,b) = A and b/ gcd(a,b) = B so that (A, B) = 1 by exercise 
a) and Am = Bn. Therefore A|Bn with (A,B) = 1, and so A|n by Euclid’s 
Lemma, as desired. We also observe that if we write n = Ak for some integer k, 
then m = Bn/A = Bk. 


One consequence is a simple way to determine the least common multiple of 
two integers from knowing their greatest common divisor. 


Corollary 3.2.2. For any positive integers a and b we have ab = gcd(a, b) · lcm[a, b].


Proof. By definition, there exist integers m and n for which am = bn = lcm[a, b].
By Corollary 3.2.1 we know that a/gcd(a, b) divides n and so L := b · a/gcd(a, b)
divides bn = lcm[a, b]. Therefore L ≤ lcm[a, b]. On the other hand L is a multiple
of b, by definition, and of a, since L = a · b/gcd(a, b). Therefore L is a common
multiple of a and b, and so L ≥ lcm[a, b] by the definition of lcm. These two
inequalities imply that L = lcm[a, b], and the result follows by multiplying through
by the denominator.


We will see an easier proof of this elegant result in exercise 3.3.2.


Exercise 3.2.1. Suppose that (a,b) = 1. Prove that if a and b both divide m, then ab divides m. 


⁵ However if we wish to develop the analogy of this theory for more complicated sets of numbers,
for example the numbers of the form {a + b√d : a, b ∈ Z} for some fixed large integer d, then Euclid’s
Lemma generalizes in a straightforward way, but the Fundamental Theorem of Arithmetic does not. We
discuss this further in appendix 3F.

3.3. Divisors using factorizations


Suppose that⁶

$$n = \prod_{p \text{ prime}} p^{n_p}, \quad a = \prod_p p^{a_p}, \quad \text{and} \quad b = \prod_p p^{b_p}.$$

If n = ab, then

$$2^{n_2} 3^{n_3} 5^{n_5} \cdots = 2^{a_2} 3^{a_3} 5^{a_5} \cdots \, 2^{b_2} 3^{b_3} 5^{b_5} \cdots = 2^{a_2+b_2} 3^{a_3+b_3} 5^{a_5+b_5} \cdots.$$

As there is only one factorization into primes of a given positive integer, by the
Fundamental Theorem of Arithmetic, we can equate the exact power of prime p
dividing each side of the last equation, to deduce that

$$n_p = a_p + b_p \quad \text{for each prime } p.$$

As $a_p, b_p \ge 0$ for each prime p, therefore

$$0 \le a_p, b_p \le n_p \quad \text{for each prime } p.$$

On the other hand if $a = 2^{a_2} 3^{a_3} 5^{a_5} \cdots$ with each $0 \le a_p \le n_p$, then a divides n
since we can construct the integer

$$b = 2^{n_2 - a_2} 3^{n_3 - a_3} 5^{n_5 - a_5} \cdots$$

for which n = ab. We have therefore classified all of the possible (positive integer)
divisors a of n.

This classification allows us to easily count the number of divisors a of n, since
this is equal to the number of possibilities for the exponents $a_p$; and we have that
each $a_p$ is any integer in the range $0 \le a_p \le n_p$. There are, therefore, $n_p + 1$
possibilities for the exponent $a_p$, for each prime p, making

$$(n_2 + 1)(n_3 + 1)(n_5 + 1) \cdots$$

possible divisors in total. Hence if we write τ(n) for the number of divisors of n,
then

$$\tau(n) = \prod_{\substack{p \text{ prime} \\ p^{n_p} \| n}} \tau(p^{n_p}),$$

and $\tau(p^k) = k + 1$ for all integers k ≥ 0. A function whose value at n equals
the product of the values of the function at the exact prime powers that divide
n is called a multiplicative function (which will be explored in detail in the next
chapter).


As an example, we see that the divisors of $175 = 5^2 7^1$ are given by

$$5^0 7^0 = 1, \quad 5^1 7^0 = 5, \quad 5^2 7^0 = 25, \quad 5^0 7^1 = 7, \quad 5^1 7^1 = 35, \quad 5^2 7^1 = 175;$$

in other words, they can all be factored as

$$5^0, \ 5^1, \text{ or } 5^2 \quad \text{times} \quad 7^0 \text{ or } 7^1.$$

Therefore the number of divisors is (2 + 1) × (1 + 1) = 3 × 2 = 6.
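The divisor count can be checked numerically. The sketch below computes τ(n) as the product of the factors $(n_p + 1)$ and compares it with a direct listing of divisors; the names `divisor_count` and `divisors` are hypothetical helpers, and trial division is assumed only because the numbers involved are small.

```python
def divisor_count(n):
    """tau(n): the product of (n_p + 1) over the prime powers p^{n_p} exactly dividing n."""
    count = 1
    p = 2
    while p * p <= n:
        exponent = 0
        while n % p == 0:
            n //= p
            exponent += 1
        count *= exponent + 1  # contributes 1 when p does not divide n
        p += 1
    if n > 1:
        count *= 2  # one remaining prime to the first power: (1 + 1) choices
    return count


def divisors(n):
    """List the positive divisors of n directly from the definition."""
    return [d for d in range(1, n + 1) if n % d == 0]


print(divisors(175))       # the six divisors listed in the example
print(divisor_count(175))  # (2 + 1) * (1 + 1)
```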


Use the Fundamental Theorem of Arithmetic in all of the remaining exercises 
in this section. 


⁶ We suppress writing “prime” in the subscript of ∏, for convenience, at least when it should be
obvious, from the context, that the parameter is only taking prime values.

Exercise 3.3.1. Use the description of the divisors of a given integer to prove the following: If
$m = \prod_p p^{m_p}$ and $n = \prod_p p^{n_p}$ are positive integers, then (a) $\gcd(m, n) = \prod_p p^{\min\{m_p, n_p\}}$ and
(b) $\mathrm{lcm}[m, n] = \prod_p p^{\max\{m_p, n_p\}}$.


The method in exercise 3.3.1(a) for finding the gcd of two integers appears to
be much simpler than the Euclidean algorithm. However, in order to make this
method work, one needs to be able to factor the integers involved. We have not yet
discussed techniques for factoring integers (though we will in chapter 10). Factoring
is typically difficult for large integers. This difficulty limits when we can, in practice,
use exercise 3.3.1 to determine gcds and lcms. On the other hand, the Euclidean
algorithm is very efficient for finding the gcd of two given integers (as discussed in
appendix 1B) without needing to know anything about those numbers.


Exercise 3.3.2. Deduce that mn = gcd(m, n) · lcm[m, n] for all pairs of natural numbers m and
n using exercise 3.3.1. (The proof in Corollary is more difficult.)


In combination with the Euclidean algorithm, the result in exercise allows
us to quickly and easily calculate the lcm of any two given integers. For
example, to determine lcm[12, 30], we first use the Euclidean algorithm to show
that gcd(12, 30) = 6, and then lcm[12, 30] = 12 · 30/gcd(12, 30) = 360/6 = 60.
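The calculation above can be sketched in a few lines. Here `math.gcd` from the Python standard library plays the role of the Euclidean algorithm, while `lcm` and `lcm_by_search` are hypothetical helper names; the second one finds the least common multiple straight from its definition so that the two can be compared.

```python
from math import gcd


def lcm(a, b):
    """lcm[a, b] for positive integers, using ab = gcd(a, b) * lcm[a, b]."""
    return a // gcd(a, b) * b


def lcm_by_search(a, b):
    """Smallest positive multiple of a that is also a multiple of b."""
    multiple = a
    while multiple % b != 0:
        multiple += a
    return multiple


print(gcd(12, 30), lcm(12, 30))  # the gcd and lcm from the worked example
```

Dividing by the gcd before multiplying keeps the intermediate values small, which is a design convenience rather than a mathematical necessity.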


Although we have already proved the results in the next exercise (exercise
1.2.1(a), Lemma 1.4.1, exercise 1.2.5(a), and Corollary 1.2.2), we can now reprove
them more easily by using our description of the divisors of a given integer.


Exercise 3.3.3. (a) Prove that d divides gcd(a, b) if and only if d divides both a and b.
(b) Prove that lcm[a, b] divides m if and only if a and b both divide m.
(c) Prove that if (a, b) = g, then (a/g, b/g) = 1.
(d) Prove that if (a, m) = (b, m) = 1, then (ab, m) = 1.
(e) Prove that if (a, b) = 1, then (ab, m) = (a, m)(b, m).
(f)† Show that the hypothesis (a, b) = 1 is necessary in part (e), by constructing a counterexample
to the conclusion when (a, b) > 1.


One can obtain the gcd and lcm for any number of integers by means similar
to exercise 3.3.1:

Example. If $A = 504 = 2^3 \cdot 3^2 \cdot 7$, $B = 2880 = 2^6 \cdot 3^2 \cdot 5$, and $C = 864 = 2^5 \cdot 3^3$,
then the greatest common divisor is $2^3 \cdot 3^2 = 72$ and the least common multiple is
$2^6 \cdot 3^3 \cdot 5 \cdot 7 = 60480$. That is, if the powers of prime p that divide A, B, and C are
$a_p$, $b_p$, and $c_p$, respectively, then the powers of p that divide the gcd and lcm are
$\min\{a_p, b_p, c_p\}$ and $\max\{a_p, b_p, c_p\}$, respectively.
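The min/max rule in this example can be sketched as follows. The helper names `factorize` and `gcd_lcm_from_factorizations` are hypothetical, and trial division is assumed purely for illustration, since the method is only practical when the integers can be factored.

```python
from functools import reduce
from math import gcd


def factorize(n):
    """Return {p: exponent} for a positive integer n, by trial division."""
    exponents = {}
    p = 2
    while p * p <= n:
        while n % p == 0:
            exponents[p] = exponents.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        exponents[n] = exponents.get(n, 0) + 1
    return exponents


def gcd_lcm_from_factorizations(*numbers):
    """Take the min (for the gcd) and max (for the lcm) exponent of each prime."""
    factorizations = [factorize(n) for n in numbers]
    primes = set().union(*factorizations)  # every prime dividing some input
    g = l = 1
    for p in primes:
        exps = [f.get(p, 0) for f in factorizations]  # exponent 0 if p is absent
        g *= p ** min(exps)
        l *= p ** max(exps)
    return g, l


print(gcd_lcm_from_factorizations(504, 2880, 864))
print(reduce(gcd, (504, 2880, 864)))  # the gcd again, via the Euclidean algorithm
```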

Exercise 3.3.4. Prove that gcd(a, b, c) = gcd(a, gcd(b, c)) and lcm[a, b, c] = lcm[a, lcm[b, c]].


Exercise 3.3.5. Prove that if each of a,b,c,... is coprime with m, then so is abc.... 


The representation of an integer in terms of its prime power factors can be 
useful when considering powers of integers: 


Exercise 3.3.6. Prove that if prime p divides $a^n$, then $p^n$ divides $a^n$.


Exercise 3.3.7. (a) Prove that a positive integer A is the square of an integer if and only if 
the exponent of each prime factor of A is even. 
(b) Prove that if a,b,c,... are pairwise coprime, positive integers and their product is a square, 
then they are each a square. 


(c) Prove that if ab is a square, then either $a = gA^2$ and $b = gB^2$, or $a = -gA^2$ and $b = -gB^2$,
where g = gcd(a, b), for some coprime integers A and B.

Exercise 3.3.8. (a) Prove that a positive integer A is the nth power of an integer if and only 
if n divides the exponent of all of the prime power factors of A. 
(b) Prove that if a,b,c,... are pairwise coprime, positive integers and their product is an nth 
power, then they are each an nth power. 


3.4. Irrationality 


One of the most beautiful applications of the Fundamental Theorem of Arithmetic
is its use in showing that there are real irrational numbers,⁷ the easiest example
being √2:

Proposition 3.4.1. The real number √2 is irrational. That is, there is no rational
number a/b for which √2 = a/b.

Proof. We will assume that √2 is rational and find a contradiction. If √2 is
rational, then we can write √2 = a/b where a and b ≥ 1 are coprime integers by
exercise b). We have a = b√2 > 0.

Now a = b√2 and so $a^2 = 2b^2$. If we factor

$$a = \prod_p p^{a_p} \quad \text{and} \quad b = \prod_p p^{b_p}, \quad \text{then} \quad \prod_p p^{2a_p} = a^2 = 2b^2 = 2 \prod_p p^{2b_p},$$

where the $a_p$’s and $b_p$’s are all integers. The exponent of the prime 2 in the factorization
of $a^2 = 2b^2$ is $2a_2 = 1 + 2b_2$, which is impossible (mod 2), giving a
contradiction. Hence √2 cannot be rational.


More generally we have, by a different proof, 


Proposition 3.4.2. If d is an integer for which √d is rational, then √d is an
integer. Therefore if integer d is not the square of an integer, then √d is irrational.


Proof. We may write √d = a/b where a and b are coprime positive integers, so
that $a^2 = db^2$. Now $(a^2, b^2) = 1$ and $a^2$ divides $db^2$, which implies that $a^2$ divides
d, by Euclid’s Lemma. But then $d \le db^2 = a^2 \le d$, implying that $d = a^2$; that is,
d is the square of an integer as claimed.


Exercise 3.4.1. Give a proof of Proposition which is analogous to the proof of Proposition 
above. 


Exercise 3.4.2.1 Prove that 17!/° is irrational (using the ideas of the proof of PropositionB.4.i). 


The proof of Proposition 3.4.2 generalizes to give a nice application of Euclid’s
Lemma to rational roots of arbitrary polynomials with integer coefficients:


Theorem 3.4 (The rational root criterion). Suppose f(x) is a polynomial with
integer coefficients, with leading coefficient $a_d$ and last coefficient $a_0$. If f(m/n) = 0
where m and n are coprime integers, then m divides $a_0$ and n divides $a_d$.


⁷ That is, real numbers that are not rational.

Proof. Writing $f(x) = \sum_{i=0}^{d} a_i x^i$ where each $a_i \in \mathbb{Z}$ we have

$$a_d m^d + a_{d-1} m^{d-1} n + \cdots + a_1 m n^{d-1} + a_0 n^d = n^d f(m/n) = 0.$$

Reducing this equation mod n gives $a_d m^d \equiv 0$ (mod n) as every other term on the
left-hand side is divisible by n. This can be restated as n divides $a_d m^d$. By the
hypothesis, we have (n, m) = 1 and so $(n, m^d) = 1$ by exercise [7.11]. Therefore n
divides $a_d m^d$ and $(n, m^d) = 1$, which implies that n divides $a_d$ by Euclid’s Lemma.
We complete the proof by establishing


Exercise 3.4.3. Prove that m divides $a_0$ by reducing the above equation mod m.


Corollary 3.4.1. If a monic polynomial f(x) € Z[x] has a rational root, then that 
root must be an integer. 


Proof. We have $a_d = 1$ as f is monic. Therefore n = ±1 in the rational root
criterion, which implies that m/n = ±m, an integer.


We can apply Corollary to the rational roots of the polynomial $x^n - d$,
and so we deduce that if $d^{1/n}$ is rational, then $d^{1/n}$ is an integer (and therefore if
$d^{1/n}$ is not an integer, then it is irrational), generalizing Proposition 3.4.2.
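The rational root criterion turns the search for rational roots into a finite check: only fractions ±m/n with m dividing $a_0$ and n dividing $a_d$ need to be tested. The sketch below assumes a non-zero constant term and takes the coefficients in the order $a_d, \ldots, a_1, a_0$; the names `rational_roots` and `divisors` are hypothetical helpers introduced for illustration.

```python
from fractions import Fraction


def divisors(n):
    """Positive divisors of a non-zero integer n."""
    n = abs(n)
    return [d for d in range(1, n + 1) if n % d == 0]


def rational_roots(coeffs):
    """Rational roots of a_d x^d + ... + a_0, given coeffs = [a_d, ..., a_1, a_0].

    Assumes integer coefficients with a_d != 0 and a_0 != 0.
    """
    a_d, a_0 = coeffs[0], coeffs[-1]
    roots = set()
    for m in divisors(a_0):
        for n in divisors(a_d):
            for candidate in (Fraction(m, n), Fraction(-m, n)):
                value = Fraction(0)
                for c in coeffs:  # Horner's rule, in exact rational arithmetic
                    value = value * candidate + c
                if value == 0:
                    roots.add(candidate)
    return sorted(roots)


print(rational_roots([2, -3, 1]))  # 2x^2 - 3x + 1 = (2x - 1)(x - 1)
print(rational_roots([1, 0, -2]))  # x^2 - 2 has no rational root
```

For a monic polynomial the only denominators tried are n = 1, so any root found is an integer, in line with the corollary.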


We have now proved that there exist infinitely many irrational numbers, the
numbers √d when d is not the square of an integer. This caused important philosophical
conundrums for the early Greek mathematicians.⁸


Exercise 3.4.4. Prove that the polynomial 2? — 3x — 1 is irreducible over Q. 


3.5. Dividing in congruences 


We are now ready to return to the topic of dividing both sides of a congruence
through by a given divisor, resolving the conundrums raised in section 2.2.


Lemma 3.5.1. If d divides both a and b and a ≡ b (mod m), then
a/d ≡ b/d (mod m/g) where g = gcd(d, m).


⁸ Ancient Greek mathematicians did not think of numbers as an abstract concept, but rather as
units of measurement. That is, one starts with fixed length measures and determines what lengths can 
be measured by a combination of those original lengths: A stick of length a can be used to measure any 
length that is a positive integer multiple of a (by measuring out k copies of length a, one after another). 
Theorem 1.1 can be interpreted as stating that if one has measuring sticks of length a and b, then one
can measure length gcd(a, b) by measuring out u copies of length a and then v copies of length b, to get 
total length au + bv = gcd(a,b). One can then measure out any multiple of gcd(a,b) by copying the 
above construction that many times. 

Pythagoras (c. 570-495 B.C.) traveled to Egypt and perhaps India in his youth on his quest for
understanding. In 530 B.C. he founded a mystical sect in Croton, a Greek colony in southern Italy,
which developed influential philosophical theories. Pythagoreans believed that numbers must be constructible
in a finite number of steps from a finite given set of lengths and so erroneously concluded that
no irrational number could be constructed in this way. However an isosceles right-angled triangle with
two sides of length 1 has a hypotenuse of length √2, and so the Pythagoreans believed that √2 must be
a rational number. When one of them proved Proposition it contradicted their whole philosophy
and so was suppressed, “for the unspeakable should always be kept secret”!

We looked at what types of lengths are “constructible” using only a compass and a straight edge 
in section of appendix 0G. In fact, although the constructible lengths are quite restricted, they are,
nonetheless, a far richer set of numbers than just the rational numbers. 

The Pythagoreans similarly associated the four regular polyhedra that were then known (the Platonic
solids after Plato) with the four “elements” (the tetrahedron with fire, the cube with earth, the
octahedron with air, and the icosahedron with water) and so believed that there could be no others.
They also suppressed their discovery of a fifth regular polyhedron, the dodecahedron.

Proof. As d divides both a and b, we may write a = dA and b = dB for some
integers A and B, so that dA ≡ dB (mod m). Hence m divides d(A − B) and
therefore m/g divides (d/g)(A − B). Now gcd(m/g, d/g) = 1 by exercise a), and so m/g
divides A − B by Euclid’s Lemma. This is the result that was claimed.


For example, 14 ≡ 91 (mod 77). Now 14 = 7 × 2 and 91 = 7 × 13, and so
we divide 7 out from 77 to obtain 2 ≡ 13 (mod 11). More interestingly 12 ≡ 42
(mod 15), and 6 divides both 12 and 42. However 6 does not divide 15, so we cannot
divide this out from 15, but rather we divide out by gcd(15, 6) = 3 to obtain 2 ≡ 7
(mod 5).
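Both examples can be reproduced with a short sketch of Lemma 3.5.1. The function name `divide_congruence` is a hypothetical helper; it checks the hypotheses of the lemma and returns the divided residues together with the reduced modulus m/g.

```python
from math import gcd


def divide_congruence(a, b, d, m):
    """Given a ≡ b (mod m) with d dividing a and b, return (a/d, b/d, m/g), g = gcd(d, m)."""
    if (a - b) % m != 0:
        raise ValueError("a and b are not congruent mod m")
    if a % d != 0 or b % d != 0:
        raise ValueError("d must divide both a and b")
    g = gcd(d, m)
    return a // d, b // d, m // g


print(divide_congruence(14, 91, 7, 77))  # divide by 7, modulus 77 -> 11
print(divide_congruence(12, 42, 6, 15))  # divide by 6, modulus 15 -> 5
print((2 - 7) % 15 == 0)                 # False: 2 and 7 are not congruent mod 15
```

The last line shows why the modulus must shrink in the second example: after dividing by 6 the congruence no longer holds mod 15, only mod 5.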


Corollary 3.5.1. Suppose that (a, m) = 1.
(i) u ≡ v (mod m) if and only if au ≡ av (mod m).
(ii) The residues

(3.5.1) a·0, a·1, ..., a·(m − 1)


form a complete set of residues (mod m).


Proof. (i) The third congruence of Lemma 2.1.1 implies that if u ≡ v (mod m),
then au ≡ av (mod m). In the other direction, we take a, b, d in Lemma 3.5.1 to
equal au, av, a, respectively. Then g = (a, m) = 1, and so au ≡ av (mod m)
implies that u ≡ v (mod m) by Lemma 3.5.1.

(ii) By part (i) we know that the residues in (3.5.1) are distinct mod m. Since
there are m of them, they must form a complete set of residues (mod m).


Corollary 3.5.1(ii) states that the residues in (3.5.1) form a complete set of
residues (mod m). In particular one of them is congruent to 1 (mod m); and so we
deduce the following:


Corollary 3.5.2. If (a, m) = 1, then there exists an integer r such that ar ≡ 1
(mod m). We call r the inverse of a (mod m). We denote this by 1/a (mod m),
or $a^{-1}$ (mod m); some authors write $\bar{a}$ (mod m).


Third proof of Theorem [For any positive integers a, b, there exist integers
u and v such that au + bv = gcd(a, b).] Let g = gcd(a, b) and write a = gA, b = gB
so that (A, B) = 1. By Corollary 3.5.2, there exists an integer r such that Ar ≡ 1
(mod B), and so there exists an integer s such that Ar − 1 = Bs; that is, Ar − Bs =
1. Therefore ar − bs = g(Ar − Bs) = g · 1 = g = gcd(a, b), as desired.


This also goes in the other direction: 


Second proof of Corollary By Theorem 1.1 there exist integers u and v
such that au + mv = 1, and so

au ≡ au + mv = 1 (mod m).

Therefore u is the inverse of a (mod m).
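The argument just given is effectively an algorithm: find u and v with au + mv = 1 and read off u. The sketch below is one way to do this with the extended Euclidean algorithm; `extended_gcd` and `inverse_mod` are hypothetical names, and non-negative inputs with m ≥ 2 are assumed.

```python
def extended_gcd(a, b):
    """Return (g, u, v) with a*u + b*v == g == gcd(a, b), for integers a, b >= 0."""
    old_r, r = a, b
    old_u, u = 1, 0
    old_v, v = 0, 1
    while r != 0:
        q = old_r // r
        # Each remainder is kept as an explicit integer combination of a and b.
        old_r, r = r, old_r - q * r
        old_u, u = u, old_u - q * u
        old_v, v = v, old_v - q * v
    return old_r, old_u, old_v


def inverse_mod(a, m):
    """Return the residue r in [0, m) with a*r ≡ 1 (mod m); requires gcd(a, m) = 1."""
    g, u, _ = extended_gcd(a % m, m)
    if g != 1:
        raise ValueError("a has no inverse mod m because gcd(a, m) > 1")
    return u % m


r = inverse_mod(17, 12)
print(r, (17 * r) % 12)  # an inverse of 17 mod 12, and the check that 17*r ≡ 1
print((19 * r) % 12)     # the residue class of 19/17 (mod 12)
```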
Exercise 3.5.1. Assume that (a, m) = 1.
(a) Prove that if b is an integer, then a·0 + b, a·1 + b, ..., a(m − 1) + b form a complete set of
residues (mod m).
(b) Deduce that for all given integers b and c, there is a unique value of x (mod m) for which
ax + b ≡ c (mod m).


If (a, m) = 1, then we can (unambiguously) express the root of ax ≡ c (mod m)
as $ca^{-1}$ (mod m), or c/a (mod m); we take this to mean the residue class mod m
which contains the unique value from exercise 3.5.1(b). For example 19/17 ≡ 11
(mod 12). Such quotients share all the properties described in Lemma 2.1.1.


Exercise 3.5.2. Prove that if $\{r_1, \ldots, r_k\}$ is a reduced set of residues mod m and (a, m) = 1,
then $\{ar_1, \ldots, ar_k\}$ is also a reduced set of residues mod m.


Exercise 3.5.3. (a) Show that there exists r (mod b) for which ar ≡ c (mod b) if and only if
gcd(a, b) divides c.
(b)† Prove that the solutions r are precisely the elements of a residue class mod b/gcd(a, b).


Exercise 3.5.4. Prove that if (a, m) > 1, then there does not exist an integer r such that ar ≡ 1
(mod m). (And so Corollary 3.5.2 could have been phrased as an “if and only if” condition.)


Exercise 3.5.5. Explain how the Euclidean algorithm may be used to efficiently determine the
inverse of a (mod m) whenever (a, m) = 1. (Calculating the inverse of a (mod m) is an essential
part of the RSA algorithm discussed in section 10.3.)


3.6. Linear equations in two unknowns 


Given integers a, b, c, can we determine all of the integer solutions m, n to
am + bn = c?

Example. To find all integer solutions to 4m + 6n = 10, we begin by noting that
we can divide through by 2 to get 2m + 3n = 5. There is clearly a solution,
2·1 + 3·1 = 5. Therefore

2m + 3n = 5 = 2·1 + 3·1,

so that 2(m − 1) = 3(1 − n). We therefore need to find all integer solutions u, v to

2u = 3v

and then the general solution to our original equation is given by m = 1 + u, n =
1 − v, as we run over the possible pairs u, v. Now 2|3v and (2, 3) = 1 so that
2|v. Hence we may write v = 2ℓ for some integer ℓ and then deduce that u = 3ℓ.
Therefore all integer solutions to 4m + 6n = 10 take the form

m = 1 + 3ℓ, n = 1 − 2ℓ, for some integer ℓ.


We can imitate this procedure to establish a general result: 


Theorem 3.5. Let a, b, c be given integers. There are solutions in integers m, n
to am + bn = c if and only if (a, b) divides c. Given a first solution, say r, s (which
can be found using the Euclidean algorithm), all integer solutions to am + bn = c
are then given by the formula

$$m = r + \frac{b}{(a,b)} \ell, \quad n = s - \frac{a}{(a,b)} \ell, \quad \text{for some integer } \ell.$$

The full set of real solutions to ax + by = c is given by

x = r + kb, y = s − ka, where k is an arbitrary real number.

By Theorem 3.5 these are integer solutions exactly when k = ℓ/(a, b) for some
ℓ ∈ Z.

In the discussion above we saw that it is best to “reduce” this to the case when
(a, b) = 1.


Corollary 3.6.1. Let a, b, c be given integers with (a, b) = 1. Given a first solution
in integers r, s to ar + bs = c, all integer solutions to am + bn = c are then given
by the formula


m = r + bℓ, n = s − aℓ, for some integer ℓ.


Deduction of Theorem from Corollary If there is a solution in integers
m, n to am + bn = c, then g := (a, b) divides a, b and am + bn = c, so we can
write a = Ag, b = Bg, c = Cg for some integers A, B, C with (A, B) = 1. We now
determine the integer solutions to Am + Bn = C, where (A, B) = 1, by Corollary
3.6.1.


Proof #1 of Corollary 3.6.1. If

am + bn = c = ar + bs,

then

a(m − r) = b(s − n).

We therefore need to find all integer solutions u, v to

au = bv.

In any given solution a divides v by Euclid’s Lemma as (a, b) = 1, and so we may
write v = aℓ for some integer ℓ and deduce that u = bℓ. We then deduce the
claimed parametrization of integer solutions to am + bn = c.
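The parametrization can be sketched in code. To keep the example short, the first solution is found here by a brute-force search over one full period of m rather than by the Euclidean algorithm; that substitution, the name `solve_linear`, and the restriction to non-zero a and b are all assumptions of this sketch.

```python
from math import gcd


def solve_linear(a, b, c):
    """Describe all integer solutions of a*m + b*n = c, for non-zero integers a and b.

    Returns (r, s, step_m, step_n), meaning m = r + step_m * l and n = s - step_n * l
    for every integer l, or None if gcd(a, b) does not divide c.
    """
    g = gcd(a, b)
    if c % g != 0:
        return None
    step_m, step_n = b // g, a // g
    # The valid m form a residue class mod |b|/g, so one lies in this range.
    for r in range(abs(step_m)):
        if (c - a * r) % b == 0:
            return r, (c - a * r) // b, step_m, step_n


r, s, step_m, step_n = solve_linear(4, 6, 10)
print(r, s, step_m, step_n)  # first solution and the two step sizes
for l in range(-2, 3):
    m, n = r + step_m * l, s - step_n * l
    print(m, n, 4 * m + 6 * n)  # the last value is always 10
```

For large coefficients the search loop would be replaced by the Euclidean algorithm, as the theorem indicates.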


Exercise 3.6.1. Show that if there exists a solution in integers m,n to am+bn = c with (a, b) = 1, 
then there exists a solution with 0 ≤ m < b.


Proof #2 of Corollary There is an inverse to a (mod b), as (a, b) = 1;
call it r. Let m be any integer ≡ rc (mod b), so that am ≡ arc ≡ c (mod b), and
therefore there exists an integer n for which am + bn = c. The result follows.


Exercise 3.6.2. (a) Find all solutions in integers m, n to 7m + 5n = 1.
(b) Find all solutions in integers u, v to 7v − 5u = 3.
(c) Find all solutions in integers j, k to 3j − 9k = 1.
(d) Find all solutions in integers r, s to 5r − 10s = 15.


Exercise 3.6.3. Show that a linear equation am + bn = c where a, b, and c are given integers, 
cannot have exactly one solution in integers m,n. 


An equation involving a congruence is said to be solved when integer values
can be found for the variables so that the congruence is satisfied. For example
6x + 5 ≡ 13 (mod 11) has the unique solution x ≡ 5 (mod 11), that is, all integers
of the form 11k + 5.

There is another way to interpret Theorem 3.5, which will prove to be the best
reformulation to understand what happens with quadratic equations:

Exercise 3.6.4 (The local-global principle for linear equations). Let a, b, c be given non-zero
integers. There are solutions in integers m, n to am + bn = c if and only if there exist residue
classes u, v (mod b) such that au + bv ≡ c (mod b).


“Global” refers to looking over the infinite number of possibilities for integer solutions, “local” 
to looking through the finite number of possibilities mod b. This exercise will be revisited in
exercise 3.9.13.


3.7. Congruences to several moduli 


What are the integers that satisfy given congruences to two different moduli? 


Lemma 3.7.1. Suppose that a, A, b, B are integers. There exists an integer x such
that both x ≡ a (mod A) and x ≡ b (mod B) if and only if b ≡ a (mod gcd(A, B)).
If there is such an integer x, then the two congruences hold simultaneously for all
integers x belonging to a unique residue class (mod lcm[A, B]).


Proof. The integers x for which x ≡ a (mod A) may be written in the form
x = Ay + a for some integer y. We are therefore seeking solutions to Ay + a = x ≡ b
(mod B), which is the same as Ay ≡ b − a (mod B). By exercise 3.5.3(a), this has
solutions if and only if gcd(A, B) divides b − a. Moreover exercise 3.5.3(b) implies
that y is a solution if and only if it is of the form u + n · B/(A, B) for some initial
solution u and any integer n. Therefore x must be of the form

x = Ay + a = A(u + n · B/(A, B)) + a = v + n · lcm[A, B],

where v = Au + a and since A · B/(A, B) = [A, B] by Corollary
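The proof translates into a short procedure for two congruences whose moduli need not be coprime. This is a sketch under stated assumptions: A and B are positive, the name `solve_pair` is hypothetical, and the built-in `pow(x, -1, n)` (available in Python 3.8 and later) supplies the modular inverse.

```python
from math import gcd


def solve_pair(a, A, b, B):
    """Solve x ≡ a (mod A), x ≡ b (mod B) for positive moduli A, B.

    Returns (v, L) meaning x ≡ v (mod L) with L = lcm[A, B], or None if
    gcd(A, B) does not divide b - a.
    """
    g = gcd(A, B)
    if (b - a) % g != 0:
        return None
    A1, B1 = A // g, B // g
    # Solve A*y ≡ b - a (mod B) by dividing through by g: A1*y ≡ (b - a)/g (mod B1).
    y = ((b - a) // g) * pow(A1, -1, B1) % B1 if B1 > 1 else 0
    L = A // g * B
    return (A * y + a) % L, L


print(solve_pair(2, 6, 8, 10))  # a common solution exists since 2 divides 8 - 2
print(solve_pair(1, 6, 2, 10))  # None: 2 does not divide 2 - 1
```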


The generalization of this last result is most elegant when we restrict to moduli 
that are pairwise coprime. We prepare with the following exercises: 


Exercise 3.7.1. Determine all integers n for which n = 101 (mod 7!) and n = 101 (mod 13!7), 
in terms of one congruence. 


Exercise 3.7.2. Suppose that a, b, c, ... are pairwise coprime integers.
(a) Prove that if a, b, c, ... each divide m, then abc... divides m.
(b) Deduce that if m ≡ n (mod a) and m ≡ n (mod b) and m ≡ n (mod c), ..., then m ≡ n
(mod abc...).


Theorem 3.6 (The Chinese Remainder Theorem). Suppose that $m_1, \ldots, m_k$
are a set of pairwise coprime positive integers. For any set of residue classes

$$a_1 \ (\mathrm{mod}\ m_1), \quad a_2 \ (\mathrm{mod}\ m_2), \ \ldots, \ a_k \ (\mathrm{mod}\ m_k),$$

there exists a unique residue class x (mod m) where $m = m_1 m_2 \cdots m_k$, for which

$$x \equiv a_j \ (\mathrm{mod}\ m_j) \quad \text{for each } j.$$


Proof. We can map x (mod m) to the vector (x (mod $m_1$), x (mod $m_2$), ..., x
(mod $m_k$)). There are $m_1 m_2 \cdots m_k$ different such vectors and each different x mod
m maps to a different one, for if x ≡ y (mod $m_j$) for each j, then x ≡ y (mod m) by
exercise 3.7.2(b). Hence there is a suitable 1-to-1 correspondence between residue
classes mod m and vectors, which implies the result.




This is known as the Chinese Remainder Theorem because of the ancient Chinese practice 
(as discussed in Sun Tzu’s 4th-century Classic Calculations) of counting the number of 
soldiers in a platoon by having them line up in three columns and seeing how many are 
left over, then in five columns and seeing how many are left over, and finally in seven 
columns and seeing how many are left over, etc. For instance, if there are a hundred 
soldiers, then there should be 1, 0, and 2 soldiers left over, respectively,⁹ and the next
smallest number of soldiers one would need for this to be true is 205 (since 205 is the
next smallest positive integer ≡ 100 (mod 105)). Presumably an experienced commander
can eyeball the difference between 100 soldiers and 205! Primary school children in China 
learn a song that celebrates this contribution. 


We can make the Chinese Remainder Theorem a practical tool by giving a
formula to determine x, given $a_1, a_2, \ldots, a_k$: Since $(m/m_j, m_j) = 1$ there exists an
integer $b_j$ such that $b_j \cdot \frac{m}{m_j} \equiv 1$ (mod $m_j$) for each j, by Corollary 3.5.2. Then

(3.7.1) $$x \equiv a_1 b_1 \cdot \frac{m}{m_1} + a_2 b_2 \cdot \frac{m}{m_2} + \cdots + a_k b_k \cdot \frac{m}{m_k} \pmod{m}.$$

This works because $m_j$ divides $m/m_i$ for each i ≠ j and so

$$x \equiv a_j b_j \cdot \frac{m}{m_j} \equiv a_j \cdot 1 \equiv a_j \pmod{m_j}$$

for each j. The $b_j$ can all be determined using the Euclidean algorithm, so x can
be determined rapidly in practice.
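Formula (3.7.1) can be sketched directly. The function name `crt` is hypothetical, the moduli are assumed to be pairwise coprime, and the inverses $b_j$ are obtained from the built-in `pow(x, -1, n)` (Python 3.8 and later) rather than from a hand-written Euclidean algorithm.

```python
from math import prod


def crt(residues, moduli):
    """Return x in [0, m) with x ≡ a_j (mod m_j) for each j, where m = m_1 * ... * m_k.

    Assumes the moduli are pairwise coprime positive integers.
    """
    m = prod(moduli)
    x = 0
    for a_j, m_j in zip(residues, moduli):
        cofactor = m // m_j            # m / m_j, divisible by every other modulus
        b_j = pow(cofactor, -1, m_j)   # b_j * (m / m_j) ≡ 1 (mod m_j)
        x += a_j * b_j * cofactor      # one term of formula (3.7.1)
    return x % m


print(crt([1, 0, 2], [3, 5, 7]))  # remainders 1, 0, 2 in columns of 3, 5, 7
```

With the soldiers' remainders 1, 0, and 2 the terms are 70, 0, and 30, whose sum is 100.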


Exercise 3.7.3.† Use this method to give a general formula for x (mod 1001) when x ≡ a
(mod 7), x ≡ b (mod 11), and x ≡ c (mod 13).


Exercise 3.7.4.† Find the smallest positive integer n which can be written as $n = 2a^2 = 3b^3 = 5c^5$
for some integers a, b, c.


There is more discussion of the Chinese Remainder Theorem in section 3.14 of
appendix 3B, in particular in the more difficult case in which the $m_j$’s have common
factors:


Exercise 3.7.5.† Given residue classes $a_1$ (mod $m_1$), ..., $a_k$ (mod $m_k$) let $m = \mathrm{lcm}[m_1, \ldots, m_k]$.
Prove that there exists a residue class b (mod m) for which $b \equiv a_j$ (mod $m_j$) for each j if and
only if $a_i \equiv a_j$ (mod $(m_i, m_j)$) for all i ≠ j.


Moreover in appendix 3C we explain how the Chinese Remainder Theorem can 
be extended to, and understood in, the more general and natural context of group 
theory. 


Exercise 3.7.6. (a) Prove that each of a, b,c,... divides m if and only if lem[a, b, c,...] divides 
m. 
(b) Deduce that ifm =n (mod a) andm =n (mod b)and...,thenm =n (mod Icm{a,b,...]). 
(c) Prove that if b (mod m) in exercise exists, then it is unique. 


Exercise 3.7.7.† Let $M, N, g$ be positive integers with $(M, N, g) = 1$. Prove that the set of
residues $\{aN + bM \pmod g : 0 \le a, b \le g-1\}$ is precisely $g$ copies of the complete set of residues
mod $g$.


*Since $100 \equiv 1 \pmod 3$, $\equiv 0 \pmod 5$, and $\equiv 2 \pmod 7$.
Exercise 3.7.8. (a) Prove that for any odd integer $m$ there are infinitely many integers $n$ for
which $(n, m) = (n+1, m) = 1$.

(b) Why is this false if $m$ is even?

(c) Prove that for any integer $m$ there are infinitely many integers $n$ for which $(n, m) =
(n+2, m) = 1$.

(d)* Let $a_1 < a_2 < \cdots < a_k$ be given integers. Give an “if and only if” criterion in terms of the
$a_i \pmod p$, for each prime $p$ dividing $m$, to determine whether there are infinitely many
integers $n$ for which $(n + a_1, m) = (n + a_2, m) = \cdots = (n + a_k, m) = 1$.


Exercise 3.7.9. Prove that there exist one million consecutive integers, each of which is divisible 
by the cube of an integer > 1. 


3.8. Square roots of 1 (mod n) 
We begin by noting 


Lemma 3.8.1. If $p$ is an odd prime, then there are exactly two square roots of 1
(mod $p$), namely $1$ and $-1$.


Proof. If $x^2 \equiv 1 \pmod p$, then $p \mid (x^2 - 1) = (x-1)(x+1)$ and so $p$ divides either
$x - 1$ or $x + 1$ by Theorem 3.1. Hence $x \equiv 1$ or $-1 \pmod p$.


There can be more than two square roots of 1 if the modulus is composite.
For example, 1, 3, 5, and 7 are all roots of $x^2 \equiv 1 \pmod 8$, while $1, 4, -4$, and $-1$
are all roots of $x^2 \equiv 1 \pmod{15}$, and $\pm 1, \pm 29, \pm 34, \pm 41$ are all square roots of 1
(mod 105). How can we find all of these solutions?


By the Chinese Remainder Theorem, $x$ is a root of $x^2 \equiv 1 \pmod{15}$ if and
only if $x^2 \equiv 1 \pmod 3$ and $x^2 \equiv 1 \pmod 5$. But, by Lemma 3.8.1, this happens if
and only if $x \equiv 1$ or $-1 \pmod 3$ and $x \equiv 1$ or $-1 \pmod 5$. There are therefore
four possibilities for $x \pmod{15}$, given by making the choices


$x \equiv 1 \pmod 3$ and $x \equiv 1 \pmod 5$, which imply $x \equiv 1 \pmod{15}$;
$x \equiv -1 \pmod 3$ and $x \equiv -1 \pmod 5$, which imply $x \equiv -1 \pmod{15}$;
$x \equiv 1 \pmod 3$ and $x \equiv -1 \pmod 5$, which imply $x \equiv 4 \pmod{15}$;
$x \equiv -1 \pmod 3$ and $x \equiv 1 \pmod 5$, which imply $x \equiv -4 \pmod{15}$,


the last two giving the less obvious solutions. This proof generalizes in a
straightforward way:


Proposition 3.8.1. If $m$ is an odd integer with $k$ distinct prime factors, then there
are exactly $2^k$ solutions $x \pmod m$ to the congruence $x^2 \equiv 1 \pmod m$.


Proof. Lemma 3.8.1 proves the result for $m$ prime. What if $m = p^e$ is a power
of an odd prime $p$? If $x^2 \equiv 1 \pmod{p^e}$, then $p \mid (x^2 - 1) = (x-1)(x+1)$ and so
$p$ divides either $x - 1$ or $x + 1$ by Theorem 3.1. However $p$ cannot divide both,
or else $p$ divides their difference, which is 2. Now suppose that $p$ does not divide
$x + 1$. Since $p^e \mid (x^2 - 1) = (x-1)(x+1)$ we deduce that $p^e \mid (x - 1)$ by Euclid’s
Lemma. Similarly, if $p$ does not divide $x - 1$, then $p^e \mid (x + 1)$. Therefore $x \equiv -1$ or
$1 \pmod{p^e}$.
Now, suppose that $a$ is an integer for which

$a^2 \equiv 1 \pmod m$,

where $m = p_1^{e_1} \cdots p_k^{e_k}$ where the $p_j$ are distinct odd primes and the $e_j \ge 1$. By the
Chinese Remainder Theorem, this is equivalent to $a$ satisfying

$a^2 \equiv 1 \pmod{p_j^{e_j}}$ for $j = 1, 2, \ldots, k$.

By the first paragraph, this is, in turn, equivalent to

$a \equiv 1$ or $-1 \pmod{p_j^{e_j}}$ for $j = 1, 2, \ldots, k$.

By the Chinese Remainder Theorem, each choice of $a \pmod{p_1^{e_1}}, \ldots, a \pmod{p_k^{e_k}}$
gives rise to a different value of $a \pmod m$ that will satisfy the congruence $a^2 \equiv 1
\pmod m$. Therefore there are exactly $2^k$ distinct solutions.


Proposition 3.8.1 is, in effect, an algorithm for finding all of the square roots
of 1 (mod $m$), provided one knows the factorization of $m$. Conversely, in section
10.1 we will see that if we are able to find square roots mod $m$, then we are able
to factor $m$.
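The algorithm implicit in the proof of Proposition 3.8.1 can be sketched as follows: choose a sign $\pm 1$ modulo each prime power and glue the choices together with the Chinese Remainder Theorem. The function name and the input convention (a list of `(p, e)` pairs with distinct odd primes `p`) are our own assumptions, not notation from the text.

```python
from itertools import product
from math import prod

def square_roots_of_one(prime_powers):
    """All x (mod m) with x^2 = 1 (mod m), where m = prod p^e over distinct odd primes p."""
    moduli = [p ** e for p, e in prime_powers]
    m = prod(moduli)
    roots = set()
    for signs in product((1, -1), repeat=len(moduli)):   # one sign per prime power
        x = 0
        for s, q in zip(signs, moduli):
            M = m // q
            x += s * pow(M, -1, q) * M                   # CRT term: s mod q, 0 mod the others
        roots.add(x % m)
    return sorted(roots)

print(square_roots_of_one([(3, 1), (5, 1)]))             # roots mod 15
print(square_roots_of_one([(3, 1), (5, 1), (7, 1)]))     # roots mod 105
```

For $m = 15$ this returns the four classes $1, 4, 11, 14$, that is, $\pm 1$ and $\pm 4$, matching the table above.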


Exercise 3.8.1. Prove that if $(x, 6) = 1$, then $x^2 \equiv 1 \pmod{24}$ without working mod 24. You
are allowed to work mod 8 and mod 3.


Exercise 3.8.2. (a) What are the roots of $x^2 \equiv 1 \pmod{2^e}$ for each integer $e \ge 1$? (This
must be different from the odd prime case since $x^2 \equiv 1 \pmod 8$ has four solutions, 1, 3, 5, 7
(mod 8).)
(b)* Prove that if $m$ has $k$ distinct prime factors, there are exactly $2^{k+\delta}$ solutions $x \pmod m$
to the congruence $x^2 \equiv 1 \pmod m$, where, if $2^e \| m$, then $\delta = 0$ if $e = 0$ or $2$, $\delta = -1$ if
$e = 1$, and $\delta = 1$ if $e \ge 3$.
(c) Deduce that the product of the square roots of 1 (mod $2^e$) equals 1 (mod $2^e$) if $e \ge 3$.


Exercise 3.8.3.† Prove that the product of the square roots of 1 (mod $m$) equals 1 (mod $m$),
unless $m = 4$ or $m = p^e$ or $m = 2p^e$ for some power $p^e$ of an odd prime $p$, in which case it equals
$-1 \pmod m$.


In Gauss’s 1801 book he gives an explicit practical example of the Chinese Remainder 
Theorem. Before pocket watches and cheap printing, people were more aware of solar 
cycles and the moon’s phases than what year it actually was. Moreover, from Roman times 
to Gauss’s childhood, taxes were hard to collect since travel was difficult and expensive 
and so were not paid annually but rather on a multiyear cycle. Gauss explained how to 
use the Chinese Remainder Theorem to deduce the year in the Julian calendar from these 
three pieces of information: 


• The indiction was used from 312 to 1806 to specify the position of the year in a
15-year taxation cycle. The indiction is $\equiv \text{year} + 3 \pmod{15}$.


• The moon’s phases and the days of the year repeat themselves every 19 years.[10]
The golden number, which is $\equiv \text{year} + 1 \pmod{19}$, indicates where one is in that cycle of
19 years (and is still used to calculate the correct date for Easter).


• The days of the week and the dates of the year repeat in cycles of 28 years in the
Julian calendar.[11] The solar cycle, which is $\equiv \text{year} + 9 \pmod{28}$, indicates where one is
in this cycle of 28 years.


[10] Meton of Athens, in the 5th century BC, observed that 19 (solar) years is less than two hours
out from being a whole number of lunar months.
[11] Since there are seven days in a week and leap years occur every four years.
Taking $m_1 = 15$, $m_2 = 19$, $m_3 = 28$, we observe that

$b_1 \equiv \frac{1}{19 \cdot 28} \equiv \frac{1}{4 \cdot (-2)} \equiv -2 \pmod{15}$ and $b_1 \cdot \frac{m}{m_1} = -2 \cdot 19 \cdot 28 = -1064$,

$b_2 \equiv \frac{1}{15 \cdot 28} \equiv \frac{1}{(-4) \cdot 9} \equiv \frac{1}{2} \equiv 10 \pmod{19}$ and $b_2 \cdot \frac{m}{m_2} = 10 \cdot 15 \cdot 28 = 4200$,

$b_3 \equiv \frac{1}{15 \cdot 19} \equiv \frac{1}{(14+1) \cdot 19} \equiv \frac{1}{14 + 19} \equiv \frac{1}{5} \equiv -11 \pmod{28}$ and $b_3 \cdot \frac{m}{m_3} = -3135$.


Therefore if the indiction is $a$, the golden number is $b$, and the solar cycle is $c$, then the
year is

$\equiv -1064a + 4200b - 3135c \pmod{7980}$.
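The coefficients above can be checked numerically. The sketch below is ours, with hypothetical function names. It assumes one point that the text leaves implicit: the value $x$ produced by the formula satisfies $x \equiv a \pmod{15}$, $x \equiv b \pmod{19}$, $x \equiv c \pmod{28}$, and since the offsets 3, 1, 9 are the residues of 4713 modulo 15, 19, 28, this means $x \equiv \text{year} + 4713 \pmod{7980}$. Subtracting 4713 is therefore our reading of how to recover the year itself.

```python
def cycle_value(indiction, golden, solar):
    """The x given by the displayed formula: x = a (15), x = b (19), x = c (28)."""
    return (-1064 * indiction + 4200 * golden - 3135 * solar) % 7980

def year_from_cycles(indiction, golden, solar):
    """Recover the year, assuming x = year + 4713 (mod 7980) as explained above."""
    return (cycle_value(indiction, golden, solar) - 4713) % 7980

# Each coefficient is 1 modulo its own modulus and 0 modulo the other two.
for coeff, own, others in ((-1064, 15, (19, 28)), (4200, 19, (15, 28)), (-3135, 28, (15, 19))):
    print(coeff % own, [coeff % q for q in others])

year = 1801
print(year_from_cycles((year + 3) % 15, (year + 1) % 19, (year + 9) % 28))
```

Residues are taken in the range 0 to modulus minus 1 here; historical tables usually write the top of each cycle as 15, 19, or 28 rather than 0, which gives the same result modulo 7980.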


Additional exercises 


Exercise 3.9.1. Prove that if $2^n - 1$ is prime, then $n$ must be prime.


Exercise 3.9.2. Suppose that $0 \le x_0 \le x_1 \le \cdots$ is a division sequence (that is, $x_m \mid x_n$ whenever
$m \mid n$; see exercise 1.7.22), with $x_{n+1} > x_n$ whenever $n \ge n_0$ ($\ge 1$). Prove that if $x_n$ is prime for
some integer $n > n_0^2$, then $n$ is prime.


We can apply exercise 3.9.2 to the Mersenne numbers $M_n = 2^n - 1$, with $n_0 = 1$,
so that if $M_n$ is prime, then $n$ is prime; and to the Fibonacci numbers with $n_0 = 2$,
so that if $F_n$ is prime, then $n$ is prime or $n = 4$.


Exercise 3.9.3. We introduced the companion sequence $(y_n)_{n \ge 0}$ of the Lucas sequence $(x_n)_{n \ge 0}$
in exercise 0.1.4. Note that $y_1 = a$ does not necessarily divide $y_2 = a^2 + 2b$.
(a)† Prove that $y_m$ divides $y_n$ whenever $m$ divides $n$ and $n/m$ is odd.
(b) Assume that $a > 1$ and $b > 0$. Deduce that if $y_n$ is prime, then $n$ must be a power of 2.
(c) Deduce that if $2^n + 1$ is prime, then it must be a Fermat number.


Exercise 3.9.4.† Prove that the Fundamental Theorem of Arithmetic implies that for any finite
set of primes $P$, the numbers $\log p$, $p \in P$, are linearly independent[12] over $\mathbb{Q}$.


Exercise 3.9.5.† Prove that $\gcd(a, b, c) \cdot \mathrm{lcm}[a, b, c] = abc$ if and only if $a$, $b$, and $c$ are pairwise
coprime.


Exercise 3.9.6.† Prove that if $a$ and $b$ are positive integers whose product is a square and whose
difference is a prime $p$, then $a + b = (p^2 + 1)/2$.


Exercise 3.9.7. Let $p$ be an odd prime and $x$, $y$, and $z$ pairwise coprime, positive integers.
(a)* Prove that $\frac{x^p - y^p}{x - y} \equiv p\,y^{p-1} \pmod{x - y}$.

(b) Deduce that $\gcd\left(\frac{x^p - y^p}{x - y},\ x - y\right) = 1$ or $p$.

(This problem is continued in exercise 7.10.6.)


Exercise 3.9.8. Suppose that $f(x) \in \mathbb{Z}[x]$ is monic and $f(0) = 1$. Prove that if $r \in \mathbb{Q}$ and
$f(r) = 0$, then $r = 1$ or $-1$.


Exercise 3.9.9 (Another proof that $\sqrt{2}$ is irrational). Suppose that $\sqrt{2} = a/b$ where $a$ and $b$ are
coprime integers, so that $a^2 = 2b^2$.

(a) Prove that 3 cannot divide $b$, and so let $c \equiv a/b \pmod 3$.

(b) Prove that $c^2 \equiv 2 \pmod 3$, and therefore obtain a contradiction.


[12] $x_1, \ldots, x_k$ are linearly dependent over $\mathbb{Q}$ if there exist rational numbers $a_1, \ldots, a_k$, which are
not all zero, such that $a_1 x_1 + \cdots + a_k x_k = 0$. They are linearly independent over $\mathbb{Q}$ if they are not
linearly dependent over $\mathbb{Q}$.


Exercise 3.9.10. (a) Prove that $\sqrt{2} + \sqrt{3}$ is irrational.
(b) Prove that $\sqrt{a} + \sqrt{b}$ is irrational unless $a$ and $b$ are both squares of integers.


Exercise 3.9.11. Suppose that $d$ is an integer and $\sqrt{d}$ is rational.
(a) Show that there exists an integer $m$ such that $\sqrt{d} - m = p/q$ where $0 \le p < q$ and $(p, q) = 1$.
(b) If $p \ne 0$, show that $\sqrt{d} + m = Q/p$ for some integer $Q$.
(c) Use (a) and (b) to establish a contradiction when $p \ne 0$.
(d) Deduce that $d = m^2$.


Reference on the many proofs that $\sqrt{2}$ is irrational


[1] John H. Conway and Joseph Shipman, Extreme proofs I: The irrationality of $\sqrt{2}$, Math. Intelligencer
35 (2013), 2-7.


We say that N can be represented by the linear form ax + by, if there exist 
integers m and n such that am+bn = N. The representation is proper if (m,n) = 1. 


Exercise 3.9.12.† In this question we prove that if $N$ can be represented by $ax + by$, then it
can be represented properly. Let $A = a/(a,b)$ and $B = b/(a,b)$. Theorem states that if
$N = ar + bs$, then all solutions to $am + bn = N$ take the form $m = r + kB$, $n = s - kA$ for some
integer $k$.

(a) Prove that $\gcd(m, n)$ divides $N$.

(b) Prove that at least one of $A$ and $B$ is not divisible by $p$, for each prime $p$.

(c) Prove that if $p \nmid A$, then there exists a residue class $k_p \pmod p$ such that $p \mid s - kA$ if and
only if $k \equiv k_p \pmod p$. Therefore deduce that $p \nmid s - kA$ if $k \equiv k_p + 1 \pmod p$. Note an
analogous result if $p \mid A$ (in which case $p \nmid B$).

(d) Deduce that there exists an integer $k$ such that, for all primes $p$ dividing $N$, either $p$ does
not divide $r + kB$ or $p$ does not divide $s - kA$ (or both).

(e) Deduce that if $m = r + kB$ and $n = s - kA$, then $N$ is properly represented by $am + bn$.


Exercise 3.9.13. Prove the following version of the local-global principle for linear equations
(exercise 3.6.4): Let $a, b, c$ be given integers. There are solutions in integers $m, n$ to $am + bn = c$
if and only if for all prime powers $p^e$ (where $p$ is prime and $e$ is an integer $\ge 1$) there exist residue
classes $u, v \pmod{p^e}$ for which $au + bv \equiv c \pmod{p^e}$.


Exercise 3.9.14. Find all solutions to 5a + 7b = 211 where a and b are positive integers. 


Exercise 3.9.15. Suppose that $f(x) \in \mathbb{Z}[x]$ and $m$ and $n$ are coprime integers.
(a) Prove that there exist integers $a$ and $b$ for which $f(a) \equiv 0 \pmod m$ and $f(b) \equiv 0 \pmod n$
if and only if there exists an integer $c$ for which $f(c) \equiv 0 \pmod{mn}$, and show that we may
take $c \equiv a \pmod m$ and $c \equiv b \pmod n$.
(b) Suppose that $p_1 < p_2 < \cdots < p_k$ are primes. Prove that there exist integers $a_1, \ldots, a_k$ such
that $f(a_i) \equiv 0 \pmod{p_i}$ for $1 \le i \le k$ if and only if there exists an integer $a$ such that
$f(a) \equiv 0 \pmod{p_1 p_2 \cdots p_k}$.


Adding reduced fractions. A reduced fraction takes the form a/b where a and 
b > 0 are coprime integers. We wish to better understand adding reduced fractions. 


Exercise 3.9.16.† Suppose that $m$ and $n$ are coprime integers.
(a) Prove that for any integer $c$ there exist integers $a$ and $b$ for which $\frac{c}{mn} = \frac{a}{m} + \frac{b}{n}$.


(b) Prove that there are (unique) positive integers a and b for which 2- = £& 
mn m 
Exercise 3.9.17. Let $m$ and $n$ be given positive integers.

(a) Prove that for any integers $a$ and $b$ there exists an integer $c$ for which
$\frac{a}{m} + \frac{b}{n} = \frac{c}{L}$ where $L = \mathrm{lcm}[m, n]$.
For the denominators 3 and 6, with $L = 6$, we have the example $\frac{1}{3} + \frac{1}{6} = \frac{1}{2}$, a case in which
the sum has a denominator smaller than $L$ when written as a reduced fraction. However
$\frac{2}{3} + \frac{1}{6} = \frac{5}{6}$ so there are certainly examples with these denominators for which the sum has
denominator $L$.

(b)* Show that $\mathrm{lcm}[m, n]$ is the smallest positive integer $L$ such that for all integers $a$ and $b$ we
can write $\frac{a}{m} + \frac{b}{n}$ as a fraction with denominator $L$. (This is why $\mathrm{lcm}[m, n]$ is sometimes
called the lowest (or least) common denominator of the fractions $1/m$ and $1/n$.)

(c)† Show that if $\frac{a}{m}$ and $\frac{b}{n}$ are reduced fractions whose sum has denominator less than $L$, then
there must exist a prime power $p^e$ such that $p^e \| m$ and $p^e \| n$ for which $p^{e+1}$ divides $an + bm$.


Appendix 3A. Factoring binomial coefficients and Pascal’s triangle modulo p


3.10. The prime powers dividing a given binomial coefficient 


Lemma 3.10.1. The power of prime $p$ that divides $n!$ is $\sum_{k \ge 1} [n/p^k]$. In other
words

$n! = \prod_{p \text{ prime}} p^{[n/p] + [n/p^2] + [n/p^3] + \cdots}$.


Proof. We wish to determine the power of $p$ dividing $n! = 1 \cdot 2 \cdot 3 \cdots (n-1) \cdot n$. If
$p^k$ is the power of $p$ dividing $m$, then we will count 1 for $p$ dividing $m$, then 1 for $p^2$
dividing $m$, ..., and finally 1 for $p^k$ dividing $m$. Therefore the power of $p$ dividing
$n!$ equals the number of integers $m$, $1 \le m \le n$, that are divisible by $p$, plus the
number of integers $m$, $1 \le m \le n$, that are divisible by $p^2$, plus .... The result
follows as there are $[n/p^j]$ integers $m$, $1 \le m \le n$, that are divisible by $p^j$ for each
$j \ge 1$, by exercise 1.7.6(c).
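The counting argument in the proof is easy to test by machine. Here is a small sketch (function names are ours) that evaluates the sum $\sum_{k \ge 1} [n/p^k]$ and compares it with the exponent found by dividing $n!$ by $p$ repeatedly.

```python
from math import factorial

def legendre(n, p):
    """Exponent of the prime p in n!, computed as the sum of floor(n / p^k) for k >= 1."""
    total, q = 0, p
    while q <= n:
        total += n // q      # number of m in 1..n divisible by q = p^k
        q *= p
    return total

def valuation(N, p):
    """Exponent of the exact power of p dividing the positive integer N."""
    e = 0
    while N % p == 0:
        N //= p
        e += 1
    return e

print(legendre(100, 5), valuation(factorial(100), 5))
```

The loop stops as soon as $p^k > n$, since all later terms of the sum are zero.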


Exercise 3.10.1. Write $n = n_0 + n_1 p + \cdots + n_d p^d$ in base $p$ so that each $n_j \in \{0, 1, \ldots, p-1\}$.
(a) Prove that $[n/p^k] = (n - (n_0 + n_1 p + \cdots + n_{k-1} p^{k-1}))/p^k$.
The sum of the digits of $n$ in base $p$ is defined to be $s_p(n) := n_0 + n_1 + \cdots + n_d$.
(b) Prove that the exact power of prime $p$ that divides $n!$ is $\frac{n - s_p(n)}{p - 1}$.


Theorem 3.7 (Kummer’s Theorem). The largest power of prime $p$ that divides
the binomial coefficient $\binom{a+b}{a}$ is given by the number of carries when adding $a$ and
$b$ in base $p$.

Example. To recover the factorization of $\binom{14}{6}$ we add 6 and 8 in each prime base
$\le 14$:

   0110       020       11       06        06          06
 + 1000_2   + 022_3   + 13_5   + 11_7   + 08_{11}   + 08_{13}
   1110       112       24       20        13          11

We see that there are no carries in base 2, 1 carry in base 3, no carries in base 5,
1 carry in base 7, 1 carry in base 11, and 1 carry in base 13, so we deduce that

$\binom{14}{6} = 2^0 \cdot 3^1 \cdot 5^0 \cdot 7^1 \cdot 11^1 \cdot 13^1$.
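The example can be reproduced with a short carry counter. This is a sketch with our own function names; it counts carries digit by digit and compares the count with the exponent of $p$ in $\binom{14}{6}$ obtained by direct division.

```python
from math import comb

def carries(a, b, p):
    """Number of carries when adding a and b in base p."""
    count, carry = 0, 0
    while a > 0 or b > 0:
        carry = 1 if (a % p) + (b % p) + carry >= p else 0
        count += carry
        a, b = a // p, b // p
    return count

def valuation(n, p):
    """Exponent of the exact power of p dividing the positive integer n."""
    e = 0
    while n % p == 0:
        n //= p
        e += 1
    return e

for p in (2, 3, 5, 7, 11, 13):
    print(p, carries(6, 8, p), valuation(comb(14, 6), p))
```

The two columns printed for each prime should agree, as Kummer’s Theorem predicts.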


Proof. For given integer $k \ge 1$, let $q = p^k$. Then let $A$ and $B$ be the least non-negative
residues of $a$ and $b \pmod q$, respectively, so that $0 \le A, B \le q - 1$. Note
that $A$ and $B$ give the first $k$ digits (from the right) of $a$ and $b$ in base $p$. If $C$ is
the first $k$ digits of $a + b$ in base $p$, then $C$ is the least non-negative residue of $a + b$
(mod $q$), that is, of $A + B \pmod q$. Now $0 \le A + B < 2q$:

• If $A + B < q$, then $C = A + B$ and there is no carry in the $k$th digit when
we add $a$ and $b$ in base $p$.

• If $A + B \ge q$, then $C = A + B - q$ and so there is a carry of 1 in the $k$th digit
when we add $a$ and $b$ in base $p$.

We need to relate these observations to the formula in Lemma 3.10.1. The
trick comes in noticing that $A = a - p^k \left[\frac{a}{p^k}\right]$, and similarly $B = b - p^k \left[\frac{b}{p^k}\right]$ and
$C = a + b - p^k \left[\frac{a+b}{p^k}\right]$. Therefore

$\left[\frac{a+b}{p^k}\right] - \left[\frac{a}{p^k}\right] - \left[\frac{b}{p^k}\right] = \frac{A + B - C}{p^k} = 1$ if there is a carry in the $k$th digit, and $= 0$ if not,

and so

$\sum_{k \ge 1} \left( \left[\frac{a+b}{p^k}\right] - \left[\frac{a}{p^k}\right] - \left[\frac{b}{p^k}\right] \right)$

equals the number of carries when adding $a$ and $b$ in base $p$. However Lemma 3.10.1
implies that this also equals the exact power of $p$ dividing $\frac{(a+b)!}{a!\,b!} = \binom{a+b}{a}$, and the
result follows.


Exercise 3.10.2. State, with proof, the analogy to Kummer’s Theorem for trinomial coefficients
$n!/(a!\,b!\,c!)$ where $a + b + c = n$.


Corollary 3.10.1. If $p^e$ divides the binomial coefficient $\binom{n}{m}$, then $p^e \le n$.
Proof. There are $k + 1$ digits in the base $p$ expansion of $n$ when $p^k \le n < p^{k+1}$.
When adding $m$ and $n - m$ there can be carries in every digit except the $(k+1)$st
(which corresponds to the number of multiples of $p^k$). Therefore there are no more
than $k$ carries when adding $m$ to $n - m$ in base $p$, so that $p^e \le p^k \le n$ by Kummer’s
Theorem.


Exercise 3.10.3. Prove that if $0 \le k \le n$, then $\binom{n}{k}$ divides $\mathrm{lcm}[m : m \le n]$.


3.11. Pascal’s triangle modulo 2


In section 0.3 we explained the theory and practice of constructing Pascal’s triangle.
We are now interested in constructing Pascal’s triangle modulo 2, mod 3, mod 4, etc.
To do so one can either reduce the binomial coefficients mod $m$ (for $m = 2, 3, 4, \ldots$)
or one can rework Pascal’s triangle, starting with a 1 in the top row and then
obtaining a row from the previous one by adding the two entries immediately above
the given entry, modulo $m$. For example, Pascal’s triangle mod 2 starts with the
rows

              1
             1 1
            1 0 1
           1 1 1 1
          1 0 0 0 1
         1 1 0 0 1 1
        1 0 1 0 1 0 1
       1 1 1 1 1 1 1 1

It is perhaps easiest to visualize this by replacing 1 (mod 2) by a dark square and,
otherwise, a white square, as in the following fascinating diagram:[13]

[Figure: Pascal’s triangle (mod 2)]


One can see patterns emerging. For example the rows corresponding to $n =
1, 3, 7, 15, \ldots$ are all 1’s, and the next rows, $n = 2, 4, 8, 16, \ldots$, start and end with a
1 and have all 0’s in between. Even more: The two 1’s at either end of row $n = 4$
seem to each be the first entry of a (four-line) triangle, which is an exact copy of
the first four rows of Pascal’s triangle mod 2, similarly the two 1’s at either end of
row $n = 8$ and the eight-line triangles beneath (and including) them. In general
if $T_k$ denotes the top $2^k$ rows of Pascal’s triangle mod 2, then $T_{k+1}$ is given by a
triangle of copies of $T_k$, with an inverted triangle of zeros in the middle, as in the
following diagram:

Figure 3.1. The top $2^{k+1}$ rows of Pascal’s triangle mod 2, in terms of the
top $2^k$ rows.

[13] This and other images in this section reproduced with kind permission of Bill Cherowitzo.

This is called self-similarity. One immediate consequence is that one can determine
the number of 1’s in a given row: If $2^k \le n < 2^{k+1}$, then row $n$ consists of two
copies of row $m$ ($:= n - 2^k$) with some 0’s in between.
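The construction rule and the self-similarity claim can both be checked directly. The sketch below (the function name is ours) builds the rows by adding the two entries above modulo $m$, prints the first eight rows mod 2 with `#` for a dark square, and then tests that row $n$ equals row $n - 2^k$, followed by zeros, followed by row $n - 2^k$ again.

```python
def pascal_rows_mod(m, count):
    """The first `count` rows of Pascal's triangle with entries reduced mod m (m >= 2)."""
    rows, row = [], [1]
    for _ in range(count):
        rows.append(row)
        # each inner entry is the sum of the two entries immediately above it, mod m
        row = [1] + [(row[i] + row[i + 1]) % m for i in range(len(row) - 1)] + [1]
    return rows

rows = pascal_rows_mod(2, 64)
for r in rows[:8]:
    print(" ".join("#" if x else "." for x in r).center(15))

for n in range(1, 64):
    k = n.bit_length() - 1            # 2^k <= n < 2^(k+1)
    m = n - 2 ** k
    assert rows[n] == rows[m] + [0] * (2 ** k - m - 1) + rows[m]
```

The number of zeros in the middle is $2^k - m - 1$, because row $n$ has $n + 1$ entries and the two copies of row $m$ account for $2(m + 1)$ of them.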


Exercise 3.11.1. Deduce that there are $2^k$ odd entries in the $n$th row of Pascal’s triangle, where
$k = s_2(n)$, the number of 1’s in the binary expansion of $n$.


This self-similarity generalizes nicely for other primes p, where we again replace 
integers divisible by p by a white square, and those not divisible by p by a black 
square. 


[Figures: Pascal’s triangle (mod 3), Pascal’s triangle (mod 5), and Pascal’s triangle (mod 7)]


The top $p$ rows are all black since the entries $\binom{n}{m}$ with $0 \le m \le n \le p - 1$ are never
divisible by $p$. Let $T_k$ denote the top $p^k$ rows of Pascal’s triangle. Then $T_{k+1}$ is
given by an array of $p$ rows of triangles, in which the $n$th row contains $n$ copies of
$T_k$, with inverted triangles of 0’s in between.


Pascal’s triangle modulo primes $p$ is a bit more complicated; we wish to color
in the black squares with one of $p - 1$ colors, each representing a different reduced
residue class mod $p$. Call the top row the 0th row, and the leftmost entry of each
row its 0th entry. Therefore the $m$th entry of the $n$th row is $\binom{n}{m}$. By Lucas’s
Theorem (exercise 2.5.10) the value of $\binom{rp^k + s}{ap^k + b} \pmod p$, which is the $b$th entry of
the $s$th row of the copy of $T_k$ which is the $a$th entry of the $r$th row of the copies
of $T_k$ that make up $T_{k+1}$, is $\equiv \binom{r}{a}\binom{s}{b} \pmod p$. In other words, the values in the
copy of $T_k$ which is the $a$th entry of the $r$th row of the copies of $T_k$ are $\binom{r}{a}$ times
the values in $T_k$.
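The block structure described here can be verified numerically. The following sketch (names are ours) evaluates a binomial coefficient mod $p$ one base-$p$ digit at a time, which is the usual computational form of Lucas’s Theorem, and then checks the identity $\binom{rp^k + s}{ap^k + b} \equiv \binom{r}{a}\binom{s}{b} \pmod p$ for one small choice of $p$ and $k$.

```python
from math import comb

def binom_mod_p(n, m, p):
    """C(n, m) mod a prime p, multiplying the binomials of corresponding base-p digits."""
    result = 1
    while n > 0 or m > 0:
        # math.comb returns 0 when the lower digit exceeds the upper digit
        result = result * comb(n % p, m % p) % p
        n, m = n // p, m // p
    return result

p, k = 3, 2
q = p ** k
ok = all(
    comb(r * q + s, a * q + b) % p == comb(r, a) * comb(s, b) % p
    for r in range(p) for a in range(p) for s in range(q) for b in range(q)
)
print(ok)
```

Working digit by digit avoids ever forming the full binomial coefficient, which matters once $n$ is large.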
The odd entries in Pascal’s triangle mod 4 make even more interesting patterns,
but this will take us too far afield; see [1] for a detailed discussion.


Reading each row of Pascal’s triangle mod 2 as the binary expansion of an
integer, we obtain the numbers

$1,\ 11_2 = 3,\ 101_2 = 5,\ 1111_2 = 15,\ 10001_2 = 17,\ 110011_2 = 51,\ 1010101_2 = 85, \ldots$.

Do you recognize these numbers? If you factor them, you obtain

$1,\ F_0,\ F_1,\ F_0F_1,\ F_2,\ F_0F_2,\ F_1F_2,\ F_0F_1F_2, \ldots$

where $F_m = 2^{2^m} + 1$ are the Fermat numbers (introduced in exercise 0.4.14). It
appears that all are products of Fermat numbers, and one can even guess at which
Fermat numbers. For example the 6th row is $F_2F_1$ and $6 = 2^2 + 2^1$ in base 2, whereas
the 7th row is $F_2F_1F_0$ and $7 = 2^2 + 2^1 + 2^0$ in base 2, and our other examples follow
this same pattern. This leads to the following challenging problem:


Exercise 3.11.2. Show that the $n$th row of Pascal’s triangle mod 2, considered as a binary
number, is given by $\prod_{j=0}^{k} F_{n_j}$, where $n = 2^{n_0} + 2^{n_1} + \cdots + 2^{n_k}$, with $0 \le n_0 < n_1 < \cdots < n_k$
(i.e., the binary expansion of $n$).
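Before attempting a proof it can be reassuring to test the pattern on more rows. The sketch below (function names are ours) reads row $n$ mod 2 as a binary number and compares it with the product of the Fermat numbers $F_j$ over the positions $j$ of the 1’s in the binary expansion of $n$. It is a numerical check only and does not replace the proof asked for in the exercise.

```python
from math import comb

def row_as_binary(n):
    """Row n of Pascal's triangle mod 2, read as a binary number (the row is symmetric)."""
    return sum((comb(n, m) % 2) << m for m in range(n + 1))

def fermat_product(n):
    """Product of the Fermat numbers F_j = 2^(2^j) + 1 over the 1-bits j of n."""
    result, j = 1, 0
    while n:
        if n & 1:
            result *= 2 ** (2 ** j) + 1
        n >>= 1
        j += 1
    return result

print([row_as_binary(n) for n in range(8)])
print(all(row_as_binary(n) == fermat_product(n) for n in range(128)))
```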


References for this chapter 
[1] Andrew Granville, Zaphod Beeblebrox’s brain and the fifty-ninth row of Pascal’s triangle, Amer. 
Math. Monthly 99 (1992), 318-331. 


[2] Kathleen M. Shannon and Michael J. Bardzell, Patterns in Pascal’s Triangle - with a Twist - First 
Twist: What is It?, Convergence (December 2004). 


Appendices. The extended version of chapter 3 has the following additional 
appendices: 


Appendix 3B. Solving linear congruences. We develop Gauss’s methods for 
solving linear congruences in several variables with composite moduli. We then 
prove the general form of the Chinese Remainder Theorem. 


Appendix 3C. Groups and rings. We present some of the basics of groups and 
rings and show how the multiplicative and additive groups mod m can be viewed 
in this more abstract way. We also prove the Fundamental Theorem of Abelian 
Groups. 

Appendix 3D. Unique factorization revisited. We discuss various situations in 
which unique factorization works and situations in which it does not. This leads 
us to a discussion of the properties of ideals which allows us to recover a notion of 
unique factorization in all situations. 


Appendix 3E. Gauss’s approach. We review Gauss’s approach to unique fac- 
torization. 


[14] An $m$-sided regular polygon with $m$ odd is constructible with ruler and compass (see section
of appendix 0G) if and only if $m$ is the product of distinct Fermat primes. Therefore the integers
$m$ created here include all of the odd $m$-sided, constructible, regular polygons.

Appendix 3F. Fundamental theorems and factoring. The Fundamental Theorem of Algebra
states that a polynomial of degree $d$, with coefficients in $\mathbb{C}$, has exactly $d$ roots, counted
with multiplicity. We indicate how to prove this and go on to better understand polynomials and
their reductions mod $m$, as well as how resultants tell us how polynomials factor
mod $m$.


Appendix 3G. Open problems. Here we revisit the Frobenius postage stamp 
problem and Egyptian fractions and introduce the 3x + 1 conjecture. 


Chapter 4 


Multiplicative functions 


In the previous chapter we discussed $\tau(n)$, which counts the number of divisors of $n$.
We discovered that $\tau(n)$ is a multiplicative function, which allowed us to calculate
its value fairly easily. Multiplicative functions, so called since


$f(mn) = f(m)f(n)$ for all pairwise coprime, positive integers $m$ and $n$,


play a central role in number theory. (Moreover $f$ is totally multiplicative, or
completely multiplicative, if $f(mn) = f(m)f(n)$ for all integers $m, n \ge 1$.) Thus
the divisor function, $\tau(n)$, is multiplicative but not totally multiplicative, since
$\tau(p^a) = a + 1$, and so $\tau(p^2) = 3$ is not equal to $\tau(p)^2 = 2^2$. Common examples
of totally multiplicative functions include $f(n) = 1$, $f(n) = n$, and $f(n) = n^s$
for a fixed complex number $s$. Another example is Liouville’s function $\lambda(n)$, which equals $-1$
to the power of the total number of prime factors of $n$, counting repetitions of
the same prime factor. For example $\lambda(2) = \lambda(3) = \lambda(12) = \lambda(32) = -1$ and
$\lambda(4) = \lambda(6) = \lambda(10) = \lambda(60) = 1$.

What makes multiplicative functions central to number theory is that one can
evaluate a multiplicative function $f(n)$ in terms of the $f(p^e)$ for the prime powers
$p^e$ dividing $n$.
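This evaluation principle is simple to express in code. The sketch below is ours: it factors $n$ by trial division and multiplies together the values of $f$ on the prime powers exactly dividing $n$, using $\tau(p^e) = e + 1$ and $\lambda(p^e) = (-1)^e$ as the two examples from the text. The helper names `factorize` and `evaluate_multiplicative` are hypothetical.

```python
def factorize(n):
    """Trial division: returns {p: e} with n equal to the product of p^e."""
    factors, p = {}, 2
    while p * p <= n:
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors

def evaluate_multiplicative(n, f_prime_power):
    """f(n) for multiplicative f, given its values f(p^e) as a function of (p, e)."""
    result = 1
    for p, e in factorize(n).items():
        result *= f_prime_power(p, e)
    return result

def tau(n):
    return evaluate_multiplicative(n, lambda p, e: e + 1)

def liouville(n):
    return evaluate_multiplicative(n, lambda p, e: (-1) ** e)

print(tau(60), [liouville(n) for n in (2, 3, 12, 32, 4, 6, 10, 60)])
```

Any other multiplicative function can be plugged in the same way by supplying its values on prime powers.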


Exercise 4.0.1. Show that if $f$ is multiplicative and $n = \prod_{p \text{ prime}} p^{e_p}$, then

$f(n) = \prod_{p \text{ prime}} f(p^{e_p})$.

Deduce that if $f$ is totally multiplicative, then $f(n) = \prod_p f(p)^{e_p}$.


Exercise 4.0.2. Prove that if $f$ is a multiplicative function, then either $f(n) = 0$ for all $n \ge 1$ or
$f(1) = 1$.


Exercise 4.0.3. Prove that if $f$ and $g$ are multiplicative functions, then so is $h$, where $h(n) =
f(n)g(n)$ for all $n \ge 1$.


Exercise 4.0.4. Prove that if $f$ is completely multiplicative and $d \mid n$, then $f(d)$ divides $f(n)$.


Exercise 4.0.5. Prove that if $f$ is multiplicative and $a$ and $b$ are any two positive integers, then

$f((a, b))\, f([a, b]) = f(a) f(b)$.


In this chapter we will focus on two multiplicative functions of great interest. 


4.1. Euler’s $\phi$-function
There are

$\phi(n) := \#\{m : 1 \le m \le n \text{ and } (m, n) = 1\}$

elements in any reduced system of residues mod $n$. Obviously $\phi(1) = 1$.


Lemma 4.1.1. $\phi(n)$ is a multiplicative function.


Proof. Suppose that $n = mr$ where $(m, r) = 1$. By the Chinese Remainder Theorem
(Theorem 3.6) there is a natural bijection between the integers $a \pmod n$ with
$(a, n) = 1$ and the pairs of integers $(b \pmod m, c \pmod r)$ with $(b, m) = (c, r) = 1$.
Since there are $\phi(m)\phi(r)$ such pairs $(b, c)$ we deduce that $\phi(n) = \phi(m)\phi(r)$.


Hence to evaluate $\phi(n)$ for all $n$ we simply need to evaluate it on the prime
powers, by exercise 4.0.1. This is straightforward because $(m, p^e) = 1$ if and only
if $(m, p) = 1$; and $(m, p) = 1$ is not satisfied if and only if $p$ divides $m$. Therefore

$\phi(p^e) = \#\{m : 1 \le m \le p^e \text{ and } (m, p) = 1\}$
$\quad = \#\{m : 1 \le m \le p^e\} - \#\{m : 1 \le m \le p^e \text{ and } p \mid m\}$
$\quad = p^e - p^{e-1}$

by exercise 1.7.6(c). We deduce the following:


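Theorem 4.1. If $n = \prod_{p \text{ prime}} p^{e_p}$, then

$\phi(n) = \prod_{p \mid n} \left(p^{e_p} - p^{e_p - 1}\right) = \prod_{p \mid n} p^{e_p}\left(1 - \frac{1}{p}\right) = n \prod_{p \mid n} \left(1 - \frac{1}{p}\right)$.


Example. $\phi(60) = 60 \cdot \left(1 - \frac12\right)\left(1 - \frac13\right)\left(1 - \frac15\right) = 16$, the least positive residues
being
1, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 49, 53, and 59.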
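Theorem 4.1 gives a quick way to compute $\phi(n)$ once the primes dividing $n$ are known. Below is a minimal sketch (our own function, using trial division to find the primes) which applies the factor $(1 - 1/p)$ for each prime $p$ dividing $n$ and compares the result with a direct count of the reduced residues.

```python
from math import gcd

def phi(n):
    """Euler's phi-function via n * prod(1 - 1/p) over the primes p dividing n."""
    result, m, p = n, n, 2
    while p * p <= m:
        if m % p == 0:
            while m % p == 0:
                m //= p
            result -= result // p     # multiply result by (1 - 1/p), exactly
        p += 1
    if m > 1:                         # one prime factor larger than sqrt may remain
        result -= result // m
    return result

reduced = [a for a in range(1, 61) if gcd(a, 60) == 1]
print(phi(60), len(reduced), reduced)
```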


We give an alternative proof of Theorem 4.1, based on the inclusion-exclusion
principle, in section 4.5.
Studying the values taken by $\phi(n)$, one makes a surprising observation:


Proposition 4.1.1. We have $\sum_{d \mid n} \phi(d) = n$.


Example. For $n = 30$, we have

$\phi(1) + \phi(2) + \phi(3) + \phi(5) + \phi(6) + \phi(10) + \phi(15) + \phi(30)$
$\quad = 1 + 1 + 2 + 4 + 2 + 4 + 8 + 8 = 30$.
Proof. Given any integer $m$ with $1 \le m \le n$, let $d = n/(m, n)$, which divides $n$.
Then $(m, n) = n/d$ so one can write $m = an/d$ with $(a, d) = 1$ and $1 \le a \le d$. Now,
for each divisor $d$ of $n$ the number of integers $m$ for which $(m, n) = n/d$ equals the
number of integers $a$ for which $(a, d) = 1$ and $1 \le a \le d$, which is $\phi(d)$ by definition.
We have therefore shown that

$n = \#\{m : 1 \le m \le n\} = \sum_{d \mid n} \#\{m : 1 \le m \le n \text{ and } (m, n) = n/d\}$
$\quad = \sum_{d \mid n} \#\{m : m = a(n/d),\ 1 \le a \le d \text{ and } (a, d) = 1\}$
$\quad = \sum_{d \mid n} \#\{a : 1 \le a \le d \text{ and } (a, d) = 1\} = \sum_{d \mid n} \phi(d)$,

which is the result claimed.
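The partition used in this proof can be observed directly. The sketch below (our own helper) sorts the integers $1 \le m \le n$ into classes according to $d = n/(m, n)$ and reports the size of each class, which the proof says should be $\phi(d)$.

```python
from collections import Counter
from math import gcd

def classes_by_gcd(n):
    """Counts of m in 1..n grouped by d = n / gcd(m, n), as in the proof."""
    return Counter(n // gcd(m, n) for m in range(1, n + 1))

counts = classes_by_gcd(30)
print(sorted(counts.items()))
print(sum(counts.values()))
```

For $n = 30$ the class sizes reproduce the summands in the example above.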


Exercise 4.1.1. Prove that if $d \mid n$, then $\phi(d)$ divides $\phi(n)$.


Exercise 4.1.2. Prove that if $n$ is odd and $\phi(n) \equiv 2 \pmod 4$, then $n$ has exactly one prime
factor (perhaps repeated several times).


Exercise 4.1.3. Prove that $\sum_{1 \le m \le n,\ (m,n)=1} m = n\phi(n)/2$ and $\prod_{d \mid n} d = n^{\tau(n)/2}$.


Exercise 4.1.4. (a) Prove that $\phi(n^2) = n\phi(n)$.
(b) Prove that if $\phi(n) \mid n - 1$, then $n$ is squarefree.
(c) Find all integers $n$ for which $\phi(n)$ is odd.


Exercise 4.1.5.† Suppose that $n$ has exactly $k$ prime factors, each of which is $> k$. Prove that
$\phi(n) \ge n/2$.


4.2. Perfect numbers. “The whole is equal to the sum of its parts.” 


The number 6 is a perfect number since it is the sum of its proper divisors (the
proper divisors of $m$ are those divisors $d$ of $m$ for which $1 \le d < m$); that is,


6=1+2+3. 


Six is a number perfect in itself, and not because God created all things 
in six days; rather, the converse is true. God created all things in six 
days because the number is perfect. 

— from The City of God by SAINT AUGUSTINE (354-430) 


The next perfect number is $28 = 1 + 2 + 4 + 7 + 14$ which is the number of days in
a lunar month. However the next, $496 = 1 + 2 + 4 + 8 + 16 + 31 + 62 + 124 + 248$,
appears to have little obvious cosmic relevance. Nonetheless, we will be interested
in trying to classify all perfect numbers. To create an equation we will add $n$ to
both sides to obtain that $n$ is perfect if and only if

$2n = \sigma(n)$, where $\sigma(n) := \sum_{d \mid n} d$.


Exercise 4.2.1. Show that $\sigma(n) = \sum_{d \mid n} n/d$, and so deduce that $n$ is perfect if and only if
$\sum_{d \mid n} \frac{1}{d} = 2$.


Exercise 4.2.2. (a) Prove that each divisor $d$ of $ab$ can be written as $\ell m$ where $\ell \mid a$ and $m \mid b$.
(b) Show that if $(a, b) = 1$, then there is a unique such pair $\ell, m$ for each divisor $d$.


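By this last exercise we see that if $(a, b) = 1$, then

$\sigma(ab) = \sum_{d \mid ab} d = \sum_{\ell \mid a,\ m \mid b} \ell m = \sum_{\ell \mid a} \ell \cdot \sum_{m \mid b} m = \sigma(a)\sigma(b)$,

proving that $\sigma$ is a multiplicative function. Now

$\sigma(p^k) = 1 + p + p^2 + \cdots + p^k = \frac{p^{k+1} - 1}{p - 1}$

by definition, and so, if $n = \prod_p p^{e_p}$,

$\sigma(n) = \prod_{p \mid n} \frac{p^{e_p + 1} - 1}{p - 1}$.

For example $\sigma(2^5 \cdot 3^3 \cdot 5^2 \cdot 7) = \frac{2^6 - 1}{2 - 1} \cdot \frac{3^4 - 1}{3 - 1} \cdot \frac{5^3 - 1}{5 - 1} \cdot \frac{7^2 - 1}{7 - 1}$.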
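The product formula for $\sigma(n)$ is straightforward to implement. In the sketch below (the function name `sigma` is ours) each prime power $p^e$ exactly dividing $n$ contributes the factor $(p^{e+1} - 1)/(p - 1)$; the result is compared against a direct sum over divisors.

```python
def sigma(n):
    """Sum of the divisors of n via the product of (p^(e+1) - 1)/(p - 1) over p^e || n."""
    total, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            q = p                        # q becomes p^(e+1) once the loop finishes
            while n % p == 0:
                n //= p
                q *= p
            total *= (q - 1) // (p - 1)
        p += 1
    if n > 1:                            # a remaining prime factor contributes 1 + n
        total *= n + 1
    return total

n = 2**5 * 3**3 * 5**2 * 7
print(sigma(n), sum(d for d in range(1, n + 1) if n % d == 0))
print([m for m in range(1, 1000) if sigma(m) == 2 * m])
```

The last line lists the perfect numbers below 1000 using the criterion $\sigma(m) = 2m$.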


Euclid observed that the first perfect numbers factor as $6 = 2 \cdot 3$ where $3 = 2^2 - 1$
is prime, and $28 = 2^2 \cdot 7$ where $7 = 2^3 - 1$ is prime, and then that this pattern
persists:

Proposition 4.2.1 (Euclid). If $2^p - 1$ is a prime number, then $2^{p-1}(2^p - 1)$ is a
perfect number.


The cases $p = 2, 3, 5$ correspond to the Mersenne primes $2^2 - 1 = 3$, $2^3 - 1 =
7$, $2^5 - 1 = 31$ and therefore yield the three smallest perfect numbers 6, 28, 496
(and the next smallest examples are given by $p = 7$ and $p = 13$).


Proof. Since $\sigma$ is multiplicative we have, for $n = 2^{p-1}(2^p - 1)$,

$\sigma(n) = \sigma(2^{p-1}) \cdot \sigma(2^p - 1) = \frac{2^p - 1}{2 - 1} \cdot (1 + (2^p - 1)) = (2^p - 1) \cdot 2^p = 2n$.


After extensive searching one finds that perfect numbers of the form $2^{p-1}(2^p - 1)$
with $2^p - 1$ prime appear to be the only perfect numbers. Euler succeeded in proving
that these are the only even perfect numbers, and we believe (but don’t know) that
there are no odd perfect numbers. If there are no odd perfect numbers, as claimed,
then we would achieve our goal of classifying all the perfect numbers.
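A brute-force search over a small range illustrates both statements. The sketch below (helper names are ours) builds Euclid’s numbers $2^{p-1}(2^p - 1)$ for the exponents $p < 14$ with $2^p - 1$ prime, and separately lists every $n < 10000$ with $\sigma(n) = 2n$; the primality test is naive trial division, which is adequate only at this scale.

```python
def is_prime(n):
    """Naive trial division, adequate for the small numbers used here."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

def sigma(n):
    """Sum of all positive divisors of n, pairing each divisor d with n // d."""
    total, d = 0, 1
    while d * d <= n:
        if n % d == 0:
            total += d
            if d != n // d:
                total += n // d
        d += 1
    return total

euclid = [2 ** (p - 1) * (2 ** p - 1) for p in range(2, 14) if is_prime(2 ** p - 1)]
perfect_below_10000 = [n for n in range(1, 10000) if sigma(n) == 2 * n]
print(euclid)
print(perfect_below_10000)
```

Within this range every perfect number found is one of Euclid’s, which is consistent with, though of course no evidence for, the belief that there are no odd perfect numbers.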


Theorem 4.2 (Euler). If $n$ is an even perfect number, then there exists a prime
number of the form $2^p - 1$ such that $n = 2^{p-1}(2^p - 1)$.


In exercise 3.9.1 we showed that if $2^p - 1$ is prime, then $p$ must itself be prime.
Now, although $2^2 - 1$, $2^3 - 1$, $2^5 - 1$, and $2^7 - 1$ are all prime, $2^{11} - 1 = 23 \times 89$ is not,
so we do not know for sure whether $2^p - 1$ is prime, even if $p$ is prime. However it
is conjectured that there are infinitely many Mersenne primes $M_p = 2^p - 1$,[1] which
would imply that there are infinitely many even perfect numbers.


[1] It is known that $2^p - 1$ is prime for $p = 2, 3, 5, 7, 13, 17, 19, \ldots, 82589933$, a total of 51 values
as of September 2019 (and this last is currently the largest prime explicitly known). There is a long
history of the search for Mersenne primes, from the first serious computers to the first great distributed
computing project, GIMPS (Great Internet Mersenne Prime Search).
Proof. Any even integer can be written as $n = 2^{k-1}m$ where $m$ is odd and $k \ge 2$,
so that if $n$ is perfect, then

$2^k m = 2n = \sigma(n) = \sigma(2^{k-1})\sigma(m) = (2^k - 1)\sigma(m)$.

Now $(2^k - 1, 2^k) = 1$ and so $2^k - 1$ divides $m$. Writing $m = (2^k - 1)M$ we find that
$\sigma(m) = 2^k M = m + M$. That is, $\sigma(m)$, which is the sum of all of the divisors of $m$,
equals the sum of just two of its divisors, namely $m$ and $M$ (and note that these
are different integers since $m = (2^k - 1)M \ge (2^2 - 1)M > M$). This implies that
$m$ and $M$ are the only divisors of $m$. The only integers with just two divisors are
the primes, so that $m$ is a prime and $M = 1$, and the result follows.


It is widely believed that the only perfect numbers were those identified by
Euclid, that is, that there are no odd perfect numbers. It has been proved that if
there is an odd perfect number, then it must be $> 10^{1500}$, and it would have to
have more than 100 (not necessarily distinct) prime factors.


Exercise 4.2.3. (a) Prove that if $p$ is odd and $k$ is odd, then $\sigma(p^k)$ is even.
(b)† Deduce that if $n$ is an odd perfect number, then $n = p^{\ell} m^2$ where $p$ is a prime that does
not divide the integer $m \ge 1$ and $p \equiv \ell \equiv 1 \pmod 4$.


Exercise 4.2.4. Fix integer $m \ge 1$. Show that there are only finitely many integers $n$ for which
$\sigma(n) = m$.


Exercise 4.2.5. (a) Prove that for all integers $n > 1$ we have the inequalities

$\prod_{p \mid n} \frac{p + 1}{p} \le \frac{\sigma(n)}{n} < \prod_{p \mid n} \frac{p}{p - 1}$.

(b) We have seen that every even perfect number has exactly two distinct prime factors. Prove
that every odd perfect number has at least three distinct prime factors.


Additional exercises 


Exercise 4.3.1. Suppose that $f(n) = 0$ if $n$ is even, $f(n) = 1$ if $n \equiv 1 \pmod 4$, and $f(n) = -1$
if $n \equiv -1 \pmod 4$. Prove that $f(\cdot)$ is a multiplicative function.


Exercise 4.3.2.† Suppose that $r(\cdot)$ is a multiplicative function taking values in $\mathbb{C}$. Let $f(n) = 1$
if $r(n) \neq 0$, and $f(n) = 0$ if $r(n) = 0$. Prove that $f(\cdot)$ is also a multiplicative function.


Exercise 4.3.3.† Suppose that $f$ is a multiplicative function, such that the value of $f(n)$ depends
only on the value of $n$ (mod 3). What are the possibilities for $f$?


Exercise 4.3.4.‡ Suppose that $f$ is a multiplicative function, such that the value of $f(n)$ depends
only on the value of $n$ (mod 8). What are the possibilities for $f$?


Exercise 4.3.5. How many of the fractions $a/n$ with $1 \leq a \leq n-1$ are reduced?


Looking at the values of $\phi(m)$, Carmichael conjectured that for all integers $m$
there exists an integer $n \neq m$ such that $\phi(n) = \phi(m)$.


Exercise 4.3.6.† (a) Find all integers $n$ for which $\phi(2n) = \phi(n)$.
(b) Find all integers $n$ for which $\phi(3n) = \phi(2n)$.
(c) Can you find other classes of $m$ for which Carmichael’s conjecture is true?
Carmichael’s conjecture is still an open problem but it is known that if it is false, then the
smallest counterexample is $> 10^{10^{10}}$.


Exercise 4.3.7.† (a) Given a polynomial $f(x) \in \mathbb{Z}[x]$ let $N_f(m)$ denote the number of $a$
(mod $m$) for which $f(a) \equiv 0 \pmod m$. Show that $N_f(m)$ is a multiplicative function.
(b) Be explicit about $N_f(m)$ when $f(x) = x^2 - 1$. (You can use section 3.8.)


Exercise 4.3.8.‡ Given a polynomial $f(x) \in \mathbb{Z}[x]$ let $R_f(m)$ denote the number of $b$ (mod $m$)
for which there exists $a$ (mod $m$) with $f(a) \equiv b \pmod m$. Show that $R_f(m)$ is a multiplicative
function. Can you be more explicit about $R_f(m)$ for $f(x) = x^2$, the example of exercise 2.5.6?


Exercise 4.3.9. Let $\tau(n)$ denote the number of divisors of $n$ (as in section 3.3), and let $\omega(n)$
and $\Omega(n)$ be the number of prime divisors of $n$ not counting and counting repeated prime factors,
respectively. Therefore $\tau(12) = 6$, $\omega(12) = 2$, and $\Omega(12) = 3$. Prove that

$$2^{\omega(n)} \leq \tau(n) \leq 2^{\Omega(n)} \quad \text{for all integers } n \geq 1.$$


Exercise 4.3.10. Let $\sigma_k(n) = \sum_{d \mid n} d^k$. Prove that $\sigma_k(n)$ is multiplicative.


Exercise 4.3.11. (a) Prove that $\tau(ab) \leq \tau(a)\tau(b)$ for all positive integers $a$ and $b$, with equality
if and only if $(a,b) = 1$.
(b) Prove that $\sigma_k(ab) \leq \sigma_k(a)\sigma_k(b)$ for all positive integers $a$, $b$, and $k$.
(c) Prove that $\sigma_{k+\ell}(n) \leq \sigma_k(n)\sigma_\ell(n)$ for all positive integers $k$, $\ell$, and $n$.


Exercise 4.3.12. Give closed formulas for (a)† $\sum_{m=1}^{n} \gcd(m,n)$ and (b)‡ $\sum_{m=1}^{n} \operatorname{lcm}(m,n)$ in
terms of the prime power factors of $n$.


Exercise 4.3.13. An integer $n$ is multiplicatively perfect if it equals the product of its proper divisors.
(a) Show that $n$ is multiplicatively perfect if and only if $\tau(n) = 4$.
(b) Classify exactly which integers $n$ satisfy this.


The integers $m$ and $n$ are amicable if the sum of the proper divisors of $m$ equals
$n$ and the sum of the proper divisors of $n$ equals $m$. For example, 220 and 284 are
amicable, as are 1184 and 1210.²


Exercise 4.3.14. (a) Show that $m$ and $n$ are amicable if and only if $\sigma(m) = \sigma(n) = m + n$.
(b) Verify Thabit ibn Qurrah’s 9th-century claim that if $p = 3 \times 2^{n-1} - 1$, $q = 3 \times 2^n - 1$, and
$r = 9 \times 2^{2n-1} - 1$ are each odd primes, then $2^n pq$ and $2^n r$ are amicable.³
(c) Find an example (other than the two given above) using the construction in (b).


An integer $n$ is abundant if the sum of its proper divisors is $> n$, for example
$n = 12$; and $n$ is deficient if the sum of its proper divisors is $< n$, for example $n = 8$.
Each positive integer is either deficient, perfect, or abundant, a classification that
goes back to antiquity.⁴
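The definitions of amicable, abundant, deficient, and perfect integers all rest on the sum of the proper divisors. Here is a small Python sketch of these definitions; the names proper_divisor_sum, classify, and are_amicable are our own choices, not standard notation.

```python
def proper_divisor_sum(n: int) -> int:
    """Sum of the positive divisors of n that are smaller than n."""
    if n == 1:
        return 0
    total = 1  # 1 is a proper divisor of every n > 1
    d = 2
    while d * d <= n:
        if n % d == 0:
            total += d
            if d != n // d:
                total += n // d
        d += 1
    return total


def classify(n: int) -> str:
    """Return 'abundant', 'deficient', or 'perfect' for a positive integer n."""
    s = proper_divisor_sum(n)
    if s > n:
        return "abundant"
    if s < n:
        return "deficient"
    return "perfect"


def are_amicable(m: int, n: int) -> bool:
    """True if each of m and n is the sum of the proper divisors of the other."""
    return proper_divisor_sum(m) == n and proper_divisor_sum(n) == m


print(classify(12), classify(8), classify(6))
print(are_amicable(220, 284), are_amicable(1184, 1210))
```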


Exercise 4.3.15. (a) Prove that every prime number is deficient. 
(b) Prove that every multiple of 6 is abundant. 
(c) How do these concepts relate to the value of $\sigma(n)/n$?
(d) Prove that every multiple of an abundant number is abundant.
(e)† Prove that if $n$ is the product of $k$ distinct primes that are each $> k$, then $n$ is deficient.
(f) Prove that every divisor of a deficient number is deficient. 


²The 14th-century scholar Ibn Khaldun claimed: “Experts on talismans assure me that these
numbers have a special influence in establishing strong bonds of friendship between individuals ... A
bond so close that they cannot be separated. The author of the Ghaia, and other great masters in this
art, swear that they have seen this happen again and again.”

³This was rediscovered by Descartes in the 17th century.

⁴Specifically a book by Nicomachus from A.D. 100. Another interesting reference is the 10th-
century German nun Hrotsvitha who depicts the heroine of her play “Sapientia” challenging Emperor
Hadrian while he is persecuting Christians, to surmise the ages of her children from information about
this classification and the number of Olympic games that each has been alive for!


Carl André is a controversial minimalist artist whose most infamous work is his
Equivalent I-VIII series, exhibited at several of the world’s leading museums. Each
of the eight sculptures involves 120 bricks arranged in a different rectangular formation.
In Equivalent VIII, at the Tate Modern in London, the bricks are stacked 2
deep, 6 wide, and 10 long. (See http://thesingleroad.blogspot.co.uk/2011/01/test-post.html
for a photo of the original eight formations.)


Exercise 4.3.16. (a) How many different 2-deep, 120-brick, rectangular formations are there? 
(b) What if there must be at least three bricks along the width and along the length? 


Appendix 4A. More 
multiplicative functions 


4.4. Summing multiplicative functions 


We have already seen that the functions $1$, $n$, $\phi(n)$, $\sigma(n)$, and $\tau(n)$ are all multiplicative.
In Proposition 4.1.1 we saw the surprising connection that $n$ is the sum
of the multiplicative function $\phi(d)$, summed over the divisors $d$ of $n$. Similarly $\tau(n)$
is the sum of $1$, and $\sigma(n)$ is the sum of $d$, summed over the divisors $d$ of $n$. This
suggests that there might be a general such phenomenon.


Theorem 4.3. For any given multiplicative function $f$, the function

$$F(n) := \sum_{d \mid n} f(d)$$

is also multiplicative.


Proof. Suppose that $n = ab$ with $(a,b) = 1$. In exercise 4.2.2 we showed that the
divisors of $n$ can be written as $\ell m$ where $\ell \mid a$ and $m \mid b$. Note that $(\ell, m) = 1$ since
$(\ell, m)$ divides $(a,b) = 1$ and so $f(\ell m) = f(\ell) f(m)$. Therefore

$$F(ab) = \sum_{d \mid ab} f(d) = \sum_{\substack{\ell \mid a \\ m \mid b}} f(\ell m) = \sum_{\ell \mid a} f(\ell) \sum_{m \mid b} f(m) = F(a) F(b),$$

as desired.
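One can watch Theorem 4.3 at work on small examples. The sketch below is a numerical illustration only, not a proof: it builds $F$ from $f$ by summing over divisors and then tests $F(ab) = F(a)F(b)$ on all coprime pairs up to a small bound. The helper names summatory and looks_multiplicative are hypothetical names of our own.

```python
from math import gcd


def divisors(n: int) -> list[int]:
    """All positive divisors of n (naive search, fine for small n)."""
    return [d for d in range(1, n + 1) if n % d == 0]


def summatory(f):
    """Given f, return the function F with F(n) = sum of f(d) over d dividing n."""
    return lambda n: sum(f(d) for d in divisors(n))


def looks_multiplicative(F, limit: int) -> bool:
    """Check F(ab) == F(a) * F(b) for all coprime a, b <= limit."""
    return all(
        F(a * b) == F(a) * F(b)
        for a in range(1, limit + 1)
        for b in range(1, limit + 1)
        if gcd(a, b) == 1
    )


tau = summatory(lambda d: 1)    # f = 1 gives the number of divisors
sigma = summatory(lambda d: d)  # f(d) = d gives the sum of divisors

print(tau(12), sigma(12))
print(looks_multiplicative(tau, 20), looks_multiplicative(sigma, 20))
```

If one starts instead from a function that is not multiplicative, such as $f(d) = d + 1$, the same test fails already at $a = b = 1$.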


It is worth noting that if we write $m = n/d$, then Theorem 4.3 becomes

$$F(n) := \sum_{m \mid n} f(n/m).$$


Above we have the examples $\{F(n), f(d)\} = \{n, \phi(d)\}$, $\{\tau(n), 1\}$, $\{\sigma(n), d\}$;
but what about for other $F(n)$? For $F(n) = 1$ we have $1 = \sum_{d \mid n} \delta(d)$ where
$\delta(d) = 1$ if $d = 1$, and $= 0$ otherwise. Finding $f$ when $F(n) = \delta(n)$ looks more
complicated. This leads us to two questions: For every multiplicative $F$, does there
exist a multiplicative $f$ for which $F(n) := \sum_{d \mid n} f(d)$? And, if so, is $f$ unique? To
answer these questions we begin by defining another multiplicative function which
arises in a rather different context.

Exercise 4.4.1. Prove that

$$\frac{n}{\phi(n)} = \sum_{\substack{d \mid n \\ d \text{ squarefree}}} \frac{1}{\phi(d)}.$$


4.5. Inclusion-exclusion and the Möbius function


In the proof of Theorem 4.1 we saw that if $n = p^a$ is a prime power, then $\phi(n)$ is
the total number of integers up to $n$, minus the number of those that are divisible
by $p$. This leads to the formula

$$\phi(n) = n - \frac{n}{p} = n\left(1 - \frac{1}{p}\right).$$

Similarly if $n = p^a q^b$, then we wish to count the number of positive integers up to
$n$ that are not divisible by either $p$ or $q$. To do so we take the $n$ integers up to $n$,
subtract the $n/p$ that are divisible by $p$ and the $n/q$ that are divisible by $q$. This
is not quite right as we have twice subtracted the $n/pq$ integers divisible by both $p$
and $q$, and so we need to add them back in. This leads to the formula

$$\phi(n) = n - \frac{n}{p} - \frac{n}{q} + \frac{n}{pq} = n\left(1 - \frac{1}{p}\right)\left(1 - \frac{1}{q}\right).$$


This argument generalizes to arbitrary $n$, though we need to keep track of the terms
of the form $\pm n/d$. In our examples so far, we see that each such $d$ is a divisor of
$n$, but the term $n/d$ only has a non-zero coefficient if $d$ is squarefree. When $d$ is
squarefree the coefficient is given by $(-1)^{\omega(d)}$ where

$$\omega(d) := \sum_{\substack{p \text{ prime} \\ p \mid d}} 1$$

is the number of distinct prime factors of $d$. One therefore deduces that the coefficient
of $n/d$ is always given by the Möbius function, $\mu(d)$, a multiplicative function
defined by

$$\mu(p) = -1, \text{ with } \mu(p^k) = 0 \text{ for all } k \geq 2, \text{ for every prime } p.$$

For example $\mu(1) = 1$, $\mu(2) = \mu(3) = -1$, $\mu(4) = 0$, $\mu(6) = \mu(10) = 1$, and
$\mu(1001) = -1$ as $1001 = 7 \times 11 \times 13$.
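The definition translates directly into a short computation. The following Python sketch evaluates $\mu(n)$ by trial division, returning 0 as soon as a repeated prime factor appears; the function name mobius is our own and the method is only meant for small $n$.

```python
def mobius(n: int) -> int:
    """Möbius function of a positive integer n, computed by trial division."""
    result = 1
    p = 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0  # p^2 divides the original n, so mu is 0
            result = -result  # one more distinct prime factor
        p += 1
    if n > 1:
        result = -result  # a single prime factor larger than sqrt(n) remains
    return result


for n in (1, 2, 3, 4, 6, 10, 1001):
    print(n, mobius(n))
```

The printed values can be compared with the examples given in the text.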


The argument for general n uses the inclusion-exclusion principle, which we 
formulate here to fit well with the topic of multiplicative functions. 


Corollary 4.5.1. We have

$$\sum_{d \mid n} \mu(d) = \begin{cases} 1 & \text{if } n = 1, \\ 0 & \text{otherwise.} \end{cases}$$


Proof. The result for $n = 1$ is trivial. If $n$ is a prime power $p^a$ with $a \geq 1$, then
$\sum_{d \mid p^a} \mu(d) = 1 + (-1) + 0 + \cdots + 0 = 0$ by definition.


The result for general $n$ then follows from Theorem 4.3.


Exercise 4.5.1. (a) Show that if $m$ is squarefree, then

$$(1 + x)^{\omega(m)} = \sum_{d \mid m} x^{\omega(d)}.$$

(b) Deduce Corollary 4.5.1.
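A proof of Theorem 4.1 using the inclusion-exclusion principle. We want
a function that counts 1 if $(a,n) = 1$ and 0 otherwise. This counting function can
be given by Corollary 4.5.1:

$$\sum_{d \mid a \ \&\ d \mid n} \mu(d) = \begin{cases} 1 & \text{if } (a,n) = 1, \\ 0 & \text{otherwise.} \end{cases}$$

Therefore

$$\begin{aligned}
\phi(n) &= \sum_{a=1}^{n} \begin{cases} 1 & \text{if } (a,n) = 1, \\ 0 & \text{otherwise} \end{cases} \\
&= \sum_{a=1}^{n} \ \sum_{d \mid a \ \&\ d \mid n} \mu(d) \\
&= \sum_{d \mid n} \mu(d) \sum_{\substack{1 \leq a \leq n \\ d \mid a}} 1 = \sum_{d \mid n} \mu(d) \frac{n}{d}.
\end{aligned}$$

The last line comes from first swapping the order of summation and then using
exercise 1.7.6(c) as $[n/d] = n/d$ since each $d$ divides $n$. Exercise 4.5.2 completes
the proof.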
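The identity $\phi(n) = \sum_{d \mid n} \mu(d)\, n/d$ obtained in this proof can be compared with a direct count of the integers coprime to $n$. The sketch below does this for small $n$; the function names are ours, and mobius is a simple trial-division routine included so that the block runs on its own.

```python
from math import gcd


def mobius(n: int) -> int:
    """Möbius function of a positive integer n, by trial division."""
    result = 1
    p = 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0
            result = -result
        p += 1
    return -result if n > 1 else result


def phi_by_counting(n: int) -> int:
    """Count the integers a in [1, n] with gcd(a, n) = 1."""
    return sum(1 for a in range(1, n + 1) if gcd(a, n) == 1)


def phi_by_mobius(n: int) -> int:
    """Evaluate the sum of mu(d) * n/d over the divisors d of n."""
    return sum(mobius(d) * (n // d) for d in range(1, n + 1) if n % d == 0)


print([phi_by_counting(n) for n in range(1, 13)])
print([phi_by_mobius(n) for n in range(1, 13)])
```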


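Exercise 4.5.2. Prove that for any positive integer $n$ we have

$$\sum_{d \mid n} \frac{\mu(d)}{d} = \prod_{p \mid n} \left(1 - \frac{1}{p}\right).$$


Exercise 4.5.3. Prove that $\mu(n)^2$ is the characteristic function for the squarefree integers, and
deduce that

$$\frac{n}{\phi(n)} = \sum_{d \mid n} \frac{\mu(d)^2}{\phi(d)}.$$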


4.6. Convolutions and the Möbius inversion formula


In the proof of Theorem 4.1 in the last section we saw that

$$\phi(n) = \sum_{d \mid n} \mu(d) \frac{n}{d}.$$

If we let $r = n/d$, then the sum is over all factorizations of $n$ into two positive
integers $n = dr$, and so

$$\phi(n) = \sum_{\substack{d, r \geq 1 \\ n = dr}} \mu(d)\, r.$$

This can be compared to Proposition 4.1.1 which yielded

$$n = \sum_{d \mid n} \phi(d) = \sum_{\substack{d, r \geq 1 \\ n = dr}} \phi(d)\, 1(r),$$

where $1(r)$ is the function that is always 1 (which is a multiplicative function).
Something similar happens for the sum of any function $f$ defined on the positive
integers.


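Theorem 4.4 (The Möbius inversion formula). For any two arithmetic functions
$f$ and $g$ we have

$$g(n) = \sum_{ab = n} f(b) \quad \text{for all integers } n \geq 1$$

if and only if

$$f(m) = \sum_{cd = m} \mu(c) g(d) \quad \text{for all integers } m \geq 1.$$

This can be rewritten as

$$g(n) = \sum_{d \mid n} f(d) \text{ for all } n \geq 1 \quad \text{if and only if} \quad f(m) = \sum_{d \mid m} \mu(m/d) g(d) \text{ for all } m \geq 1.$$


Proof. If $f(m) = \sum_{cd = m} \mu(c) g(d)$ for all integers $m \geq 1$, then

$$\sum_{ab = n} f(b) = \sum_{ab = n} \sum_{cd = b} \mu(c) g(d) = \sum_{acd = n} \mu(c) g(d) = \sum_{d \mid n} g(d) \cdot \sum_{ac = n/d} \mu(c) = g(n),$$

since this last sum is 0 unless $n/d = 1$, that is, unless $d = n$. Similarly if $g(n) =
\sum_{ab = n} f(b)$ for all integers $n \geq 1$, then

$$\sum_{cd = m} \mu(c) g(d) = \sum_{cd = m} \mu(c) \sum_{ab = d} f(b) = \sum_{abc = m} \mu(c) f(b) = \sum_{b \mid m} f(b) \cdot \sum_{ac = m/b} \mu(c) = f(m),$$

as desired.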
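Note that the theorem does not require $f$ to be multiplicative. As a sanity check, the sketch below starts from an arbitrary arithmetic function (here $f(n) = n^2 + 3$, chosen only for illustration), forms $g(n) = \sum_{d \mid n} f(d)$, and then recovers $f$ from $g$ using $f(m) = \sum_{d \mid m} \mu(m/d) g(d)$. All function names are our own.

```python
def divisors(n: int) -> list[int]:
    """All positive divisors of n (naive search)."""
    return [d for d in range(1, n + 1) if n % d == 0]


def mobius(n: int) -> int:
    """Möbius function of a positive integer n, by trial division."""
    result = 1
    p = 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0
            result = -result
        p += 1
    return -result if n > 1 else result


def sum_over_divisors(f):
    """Return g with g(n) = sum of f(d) over the divisors d of n."""
    return lambda n: sum(f(d) for d in divisors(n))


def mobius_invert(g):
    """Return f with f(m) = sum of mu(m/d) * g(d) over the divisors d of m."""
    return lambda m: sum(mobius(m // d) * g(d) for d in divisors(m))


f = lambda n: n * n + 3            # an arbitrary, non-multiplicative choice
g = sum_over_divisors(f)
recovered = mobius_invert(g)

print([f(n) for n in range(1, 11)])
print([recovered(n) for n in range(1, 11)])
```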


In the discussion above we saw several examples of the convolution $f * g$ of two
multiplicative functions $f$ and $g$, which we define by

$$(f * g)(n) := \sum_{ab = n} f(a) g(b).$$

Note that $f * g = g * f$. We saw that if $I(n) = n$, then $\phi * 1 = I$ and $\mu * I = \phi$, as
well as $1 * \mu = \delta$.


Exercise 4.6.1. Prove that $\delta * f = f$ for all $f$, $\tau = 1 * 1$, and $\sigma = 1 * I$.


Proposition 4.6.1. For any two multiplicative functions $f$ and $g$, the convolution
$f * g$ is also multiplicative.


Exercise 4.6.2. Prove that if $ab = mn$, then there exist integers $r, s, t, u$ with $a = rs$, $b = tu$,
$m = rt$, $n = su$ with $(s,t) = 1$.


Proof. Suppose that $(m,n) = 1$. For $h = f * g$, we have

$$h(mn) = \sum_{ab = mn} f(a) g(b).$$

We use exercise 4.6.2 and note that $(r,s)$ and $(t,u)$ both divide $(m,n) = 1$ and
so both equal 1. Therefore $f(a) = f(rs) = f(r) f(s)$ and $g(b) = g(tu) = g(t) g(u)$.
This implies that

$$h(mn) = \sum_{\substack{rt = m \\ su = n}} f(rs) g(tu) = \sum_{rt = m} f(r) g(t) \sum_{su = n} f(s) g(u) = h(m) h(n).$$


In this new language, Theorem 4.3, which states that $1 * f$ is multiplicative
whenever $f$ is, is the special case $g = 1$ of Proposition 4.6.1. Corollary 4.5.1 states
that $1 * \mu = \delta$. The Möbius inversion formula states that $F = 1 * f$ if and only if
$f = \mu * F$. It is also easy to prove the Möbius inversion formula with this notation
since if $F = 1 * f$, then $\mu * F = \mu * 1 * f = \delta * f = f$; and if $f = \mu * F$, then
$1 * f = 1 * \mu * F = \delta * F = F$.
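The convolution identities collected above can be checked mechanically for small $n$. The following Python sketch implements $(f * g)(n) = \sum_{ab = n} f(a) g(b)$ and tests a few of them. The names convolve, one, identity, and delta are our own labels for $*$, $1$, $I$, and $\delta$, and the check is an illustration rather than a proof.

```python
from math import gcd


def convolve(f, g):
    """Dirichlet convolution: (f * g)(n) = sum of f(a) * g(n/a) over a dividing n."""
    return lambda n: sum(f(a) * g(n // a) for a in range(1, n + 1) if n % a == 0)


def mobius(n: int) -> int:
    """Möbius function of a positive integer n, by trial division."""
    result = 1
    p = 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0
            result = -result
        p += 1
    return -result if n > 1 else result


def phi(n: int) -> int:
    """Euler's function, by counting integers in [1, n] coprime to n."""
    return sum(1 for a in range(1, n + 1) if gcd(a, n) == 1)


one = lambda n: 1                      # the function that is always 1
identity = lambda n: n                 # I(n) = n
delta = lambda n: 1 if n == 1 else 0   # the unit for convolution

N = 60
print(all(convolve(one, mobius)(n) == delta(n) for n in range(1, N + 1)))
print(all(convolve(phi, one)(n) == identity(n) for n in range(1, N + 1)))
print(all(convolve(mobius, identity)(n) == phi(n) for n in range(1, N + 1)))
```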


Exercise 4.6.3. Prove that $(\mu * \sigma)(n) = n$ for all integers $n \geq 1$.


Exercise 4.6.4. (a) Show that $(a * f) + (b * f) = (a + b) * f$.
(b) Let $f(n) \geq 0$ for all integers $n \geq 1$. Prove that $(1 * f)(n) + (\mu * f)(n) \geq 2f(n)$ for all integers
$n \geq 1$.
(c) Prove that $\sigma(n) + \phi(n) \geq 2n$ for all integers $n \geq 1$.


Exercise 4.6.5. Suppose that $g(n) = \prod_{d \mid n} f(d)$. Deduce that $f(n) = \prod_{d \mid n} g(d)^{\mu(n/d)}$.


4.7. The Liouville function 


The number of prime factors of a given integer $n = \prod_{i=1}^{k} p_i^{e_i}$ can be interpreted in
two different ways:

$$\omega(n) := \sum_{p \mid n} 1 = \#\{\text{distinct primes that divide } n\} = k$$

and

$$\Omega(n) := \sum_{\substack{p \text{ prime},\ j \geq 1 \\ p^j \mid n}} 1 = \#\{\text{distinct prime powers that divide } n\} = \sum_{i=1}^{k} e_i.$$

In other words, $\Omega(n)$ counts the number of primes when one factors $n$ into primes
without using exponents, so $\Omega(12) = 3$ as $12 = 2 \times 2 \times 3$, while $\omega(n)$ counts the
number of primes when one factors $n$ into primes using exponents, so $\omega(12) = 2$ as
$12 = 2^2 \cdot 3$. For other examples, $\omega(27) = 1$ with $\Omega(27) = 3$, and $\omega(36) = 2$ with
$\Omega(36) = 4$, while $\Omega(105) = \omega(105) = 3$.

Another interesting multiplicative function is Liouville’s function, defined at
the start of this chapter by $\lambda(n) = (-1)^{\Omega(n)}$ so that, for example, $\lambda(12) = (-1)^3 =
-1$. We notice that $\lambda$ is the totally multiplicative function that agrees with $\mu$ on
the squarefree integers. Liouville’s function feels, intuitively, more natural, but
Möbius’s function fits better with the theory.
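The two ways of counting prime factors, and Liouville's function, are easy to compute from the factorization of $n$. In the sketch below the names small_omega, big_omega, and liouville are our own labels for $\omega$, $\Omega$, and $\lambda$, and factoring is done by simple trial division.

```python
def prime_exponents(n: int) -> dict[int, int]:
    """Return {prime: exponent} for a positive integer n, by trial division."""
    factors: dict[int, int] = {}
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors


def small_omega(n: int) -> int:
    """Number of distinct primes dividing n."""
    return len(prime_exponents(n))


def big_omega(n: int) -> int:
    """Number of prime factors of n counted with multiplicity."""
    return sum(prime_exponents(n).values())


def liouville(n: int) -> int:
    """Liouville's function: (-1) raised to the power big_omega(n)."""
    return (-1) ** big_omega(n)


for n in (12, 27, 36, 105):
    print(n, small_omega(n), big_omega(n), liouville(n))
```

Since $\Omega(ab) = \Omega(a) + \Omega(b)$ for all positive integers $a$ and $b$, the function liouville satisfies $\lambda(ab) = \lambda(a)\lambda(b)$ without any coprimality condition, which is what "totally multiplicative" means.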


Exercise 4.7.1. Prove that $\Omega(n) \geq \omega(n)$ for all integers $n \geq 1$, with equality if and only if $n$ is
squarefree.


Exercise 4.7.2. Prove that $\lambda * \mu^2 = \delta$.


Exercise 4.7.3. (a) Prove that

$$\sum_{d \mid n} \lambda(d) = \begin{cases} 1 & \text{if } n \text{ is a square}, \\ 0 & \text{otherwise.} \end{cases}$$

(b)‡ By summing the formula in (a) over all positive integers $n \leq N$, deduce that for all integers
$N \geq 1$ we have

$$\sum_{n \geq 1} \lambda(n) \left[\frac{N}{n}\right] = [\sqrt{N}].$$


Additional exercises 


Exercise 4.8.1. Prove that $\mu(n)\mu(n+1)\mu(n+2)\mu(n+3) = 0$ for all integers $n \geq 1$.


Exercise 4.8.2. Prove that $\phi(n) + \sigma(n) = 2n$ if and only if $n = 1$ or $n$ is a prime.


Exercise 4.8.3. (a) By summing the formula in Corollary 4.5.1 over all positive integers $n \leq N$,
deduce that

$$\sum_{n \geq 1} \mu(n) \left[\frac{N}{n}\right] = 1 \quad \text{for all } N \geq 1.$$

(b)‡ Deduce that

$$\left| \sum_{n \leq N} \frac{\mu(n)}{n} \right| \leq 1 \quad \text{for all } N \geq 1.$$

It is a much deeper problem to prove that $\sum_{n \leq N} \mu(n)/n$ tends to a limit as $N \to \infty$.


Exercise 4.8.4. (a) Prove that if $f$ is an arithmetic function, then

$$\sum_{n \geq 1} f(n) \frac{x^n}{1 - x^n} = \sum_{m \geq 1} (1 * f)(m)\, x^m,$$

without worrying about convergence.
(b) Write out explicitly the example $f = \mu$ as well as some other common multiplicative functions.


Appendices. The extended version of chapter 4 has the following additional 
appendices: 


Appendix 4B. Dirichlet series and multiplicative functions. We discuss the 
construction of Dirichlet series, establishing that they have an Euler product if the 
coefficients are given by a multiplicative function. We discuss convergence issues, 
interesting Dirichlet series, and important identities. 


Appendix 4C. Irreducible polynomials mod p. We develop, in part, the analogy 
to the theory of this chapter when working with polynomials mod p, giving a formula 
for the number of irreducibles. 


Appendix 4D. The harmonic sum and the divisor function. We develop upper 
and lower bounds for the sum of 1/n over positive integers n < N and use these 
to determine a good estimate for the average number of divisors of an integer. We 
develop Dirichlet’s hyperbola method to get a spectacularly accurate estimate for 
this average. 


Appendix 4E. Cyclotomic polynomials. We introduce their properties which 
will come in useful in several areas, later on. 


Chapter 5 


The distribution 
of prime numbers 


Once one begins to determine which integers are primes, one quickly finds that 
there are many. Are there infinitely many? One notices that the primes seem to 
make up a decreasing proportion of the positive integers. Can we explain this? Can 
we give a formula for how many primes there are up to a given point? Or at least 
give a good estimate? 


When we write out the primes there seem to be patterns, though the patterns 
rarely persist for long. Can we find patterns that do persist? Is there a formula 
that describes all of the primes? Or at least some of them? 

Is it possible to recognize prime numbers quickly and easily? 


These questions motivate different parts of this chapter and of chapter 10. 


5.1. Proofs that there are infinitely many primes 


The first known proof appears in Euclid’s Elements, Book IX, Proposition 20: 


Theorem 5.1. There are infinitely many primes. 


Proof #1.¹ Suppose that there are only finitely many primes, which we will denote
by $2 = p_1 < p_2 = 3 < \cdots < p_k$. What are the prime factors of $p_1 p_2 \cdots p_k + 1$? Since
this number is $> 1$ it must have a prime factor by the Fundamental Theorem of
Arithmetic, and this must be $p_j$ for some $j$, $1 \leq j \leq k$, since all primes are contained
amongst $p_1, p_2, \ldots, p_k$. But then $p_j$ divides both $p_1 p_2 \cdots p_k$ and $p_1 p_2 \cdots p_k + 1$, and
hence $p_j$ divides their difference, 1, by exercise 1.1.1(c), which is impossible.


¹Not until relatively recently has there been mathematical notation to describe a collection of
objects, for example, $p_1, p_2, \ldots, p_k$. Neither Euclid nor Fermat had subscripts or “...” or “etc.” (Gauss
used “&c”). So instead the reader had to infer from the context how many objects the author meant. In
Euclid’s Elements, he writes that he assumes α, β, γ denote all of the prime numbers and then gives, in
terms of ideas, the same proof as here. The reader had to understand that in writing “α, β, γ”, Euclid
meant an arbitrary number of primes, not just three!


There are many variants on Euclid’s proof. For example:


Exercise 5.1.1 (Proof #2). Suppose that there are only finitely many primes, the largest of 
which is n > 2. Show that this is impossible by considering the prime factors of n! — 1. 


Other variants include Furstenberg’s curious proof using point-set topology (see
appendix 5F). These all boil down to showing that there exists an integer $q > 1$ that
is not divisible by any of a given finite set of primes $p_1, \ldots, p_k$. If $m = p_1 p_2 \cdots p_k$,
then we wish to show there exists an integer $q > 1$ with $(q,m) = 1$, and there are
$\phi(m) - 1$ such integers up to $m$. There is therefore such an integer by the formula
in Theorem 4.1 once $m > 2$.


Exercise 5.1.2. Prove that there are infinitely many composite numbers. 


Euclid’s proof that there are infinitely many primes is a “proof by contradic- 
tion”, showing that it is impossible that there are finitely many, and so does not 
suggest how one might find infinitely many. We can use the following constructive 
technique to determine infinitely many primes: 


Lemma 5.1.1. Suppose that $a_1 < a_2 < \cdots$ is an infinite sequence of pairwise
coprime positive integers, and let $p_n$ be a prime factor of $a_n$ for each $n \geq 2$. Then
$p_2, p_3, \ldots$ is an infinite sequence of distinct primes.


Proof. If $p_m = p_n$ with $1 < m < n$, then $p_m$ divides both $a_m$ and $a_n$ and so divides
$(a_m, a_n) = 1$, which is impossible.


By Lemma 5.1.1 we need only find an infinite sequence of pairwise coprime positive
integers to obtain infinitely many primes. This can be achieved by modifying
Euclid’s construction. We define the sequence

$$a_1 = 2,\ a_2 = 3 \text{ and then } a_n = a_1 a_2 \cdots a_{n-1} + 1 \text{ for each } n \geq 2.$$

Now if $m < n$, then $a_n \equiv 1 \pmod{a_m}$ and so $(a_m, a_n) = (a_m, 1) = 1$ by exercise
2.1.5, as desired. Therefore, by Lemma 5.1.1, we can take a prime factor $p_n$ of each
$a_n$ with $n \geq 1$ to obtain an infinite sequence of prime numbers.
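To see this construction in action, one can generate the first few terms $a_n$ and read off a prime factor of each. The Python sketch below does this with trial division, so it is only practical for the first handful of terms since the $a_n$ grow very quickly. The function names are our own.

```python
from math import gcd


def euclid_sequence(count: int) -> list[int]:
    """First `count` terms of a_1 = 2, a_n = a_1 * a_2 * ... * a_(n-1) + 1."""
    terms = [2]
    product = 2  # running product of all terms so far
    while len(terms) < count:
        nxt = product + 1
        terms.append(nxt)
        product *= nxt
    return terms


def smallest_prime_factor(n: int) -> int:
    """Smallest prime dividing n > 1, found by trial division."""
    p = 2
    while p * p <= n:
        if n % p == 0:
            return p
        p += 1
    return n


terms = euclid_sequence(7)
primes = [smallest_prime_factor(a) for a in terms]
print(terms)
print(primes)
```

The terms are pairwise coprime, so the listed prime factors are necessarily distinct, as Lemma 5.1.1 asserts.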


Fermat conjectured that the integers $F_n = 2^{2^n} + 1$ are primes for all $n \geq 0$.
His claim starts off correct: 3, 5, 17, 257, 65537 are all prime, but his conjecture is
false for $F_5 = 641 \times 6700417$, as Euler famously noted. It is an open question as to
whether there are infinitely many primes of the form $F_n$.² Using the identity

$$(5.1.1) \qquad F_n = F_0 F_1 \cdots F_{n-1} + 2 \text{ for each } n \geq 1$$

we see that if $m < n$, then $F_n \equiv 2 \pmod{F_m}$ so that $(F_m, F_n) = (F_m, 2) = 1$, the
last equality since each $F_m$ is odd. Therefore, by Lemma 5.1.1, we can take a prime
factor $p_n$ of each $F_n$ to obtain an infinite sequence of prime numbers.³
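The facts about Fermat numbers used here can be confirmed for the first few values with exact integer arithmetic. The following Python sketch checks Euler's factorization of $F_5$, the product identity with the product starting at $F_0 = 3$, and pairwise coprimality. The helper names are our own.

```python
from math import gcd, prod


def fermat(n: int) -> int:
    """The Fermat number F_n = 2^(2^n) + 1."""
    return 2 ** (2 ** n) + 1


def product_identity_holds(n: int) -> bool:
    """Check F_n == F_0 * F_1 * ... * F_(n-1) + 2 for a given n >= 1."""
    return fermat(n) == prod(fermat(i) for i in range(n)) + 2


def pairwise_coprime(count: int) -> bool:
    """Check that F_0, ..., F_(count-1) are pairwise coprime."""
    values = [fermat(i) for i in range(count)]
    return all(gcd(values[m], values[n]) == 1 for n in range(count) for m in range(n))


print([fermat(n) for n in range(5)])
print(fermat(5) == 641 * 6700417)
print(all(product_identity_holds(n) for n in range(1, 8)))
print(pairwise_coprime(8))
```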


These proofs that there are infinitely many primes will be generalized using 
dynamical systems in appendix 5H. 


²The only Fermat numbers known to be primes have $n \leq 4$. We know that the $F_n$ are composite
for $5 \leq n \leq 30$ and for many other $n$ besides. It is always a significant moment when a Fermat number
is factored for the first time. It could be that all $F_n$ with $n > 4$ are composite or they might all be
prime from some sufficiently large $n$ onwards or some might be prime and some composite. Currently,
we have no way of knowing which is true.

³This proof that there are infinitely many primes first appeared in a letter from Goldbach to Euler
in July 1730.


Exercise 5.1.3. Prove (5.1.1).


Exercise 5.1.4. Suppose that $p_1 = 2 < p_2 = 3 < \cdots$ is the sequence of prime numbers. Use the
fact that every Fermat number has a distinct prime divisor to prove that $p_n \leq 2^{2^n} + 1$. What can
one deduce about the number of primes up to $x$?


Exercise 5.1.5. (a) Show that if $m$ is not a power of 2, then $2^m + 1$ is composite by showing
that $2^a + 1$ divides $2^{ab} + 1$ whenever $b$ is odd.
(b) Deduce that if $2^m + 1$ is prime, then there exists an integer $n$ such that $m = 2^n$; that is, if
$2^m + 1$ is prime, then it is a Fermat number $F_n = 2^{2^n} + 1$. (This also follows from exercise
3.9.3(b).)


Another interesting sequence is the Mersenne numbers,⁴ which take the form
$M_n = 2^n - 1$. After exercise 3.9.2 we observed that if $M_n$ is prime, then $n$ is
prime and, in our discussion of perfect numbers (section 4.2), we observed that
$M_2$, $M_3$, $M_5$, and $M_7$ are each prime but $M_{11} = 23 \times 89$ is not. The Lucas-Lehmer
test provides a relatively quick and elegant way to test whether a given $M_n$ is prime
(see Corollary 10.10.1 in appendix 10C).


5.2. Distinguishing primes 


We can determine whether a given integer $n$ is prime in practice, by proving that
it is not composite: If a given integer $n$ is composite, then we can write it as $ab$,
with $a$ and $b$ integers both $> 1$. If we suppose that $a \leq b$, then $a^2 \leq ab = n$ and so $a \leq \sqrt{n}$.
Hence $n$ must be divisible by some integer $a$ in the range $1 < a \leq \sqrt{n}$. Therefore
we can test divide $n$ by every integer $a$ in this range, and we either discover a factor
of $n$ or, if not, we know that $n$ must be prime. This process is called trial division
and is too slow, in practice, except for relatively small integers $n$. We can slightly
improve this algorithm by noting that if $p$ is a prime dividing $a$, then $p$ divides $n$,
so we only need to test divide by the primes up to $\sqrt{n}$. This is still very slow, in
practice.⁵ We discuss more practical techniques in chapter 10.
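Trial division as just described can be written in a few lines. This Python sketch test divides by every integer $a$ with $1 < a \leq \sqrt{n}$ (not only by primes) and uses the standard library function math.isqrt for the integer square root. The names smallest_factor and is_prime are our own.

```python
from math import isqrt


def smallest_factor(n: int):
    """Return the least a with 1 < a <= sqrt(n) that divides n, or None if there is none."""
    for a in range(2, isqrt(n) + 1):
        if n % a == 0:
            return a
    return None


def is_prime(n: int) -> bool:
    """An integer n >= 2 is prime exactly when trial division finds no factor."""
    return n >= 2 and smallest_factor(n) is None


print(is_prime(97), is_prime(91), smallest_factor(91))
print(smallest_factor(2 ** 11 - 1))  # the Mersenne number M_11
```

The number of divisions grows like $\sqrt{n}$, which is why the method becomes impractical once $n$ has more than a few dozen digits.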


Trial division is a very slow way of recognizing whether an individual integer is
prime, but it can be organized to be a highly efficient way to determine all of the
primes up to some given point, as observed by Eratosthenes around 200 B.C.⁶


The sieve of Eratosthenes provides an efficient method for finding all of the 
primes up to x. For example to find all the primes up to 100, we begin by writing 
down every integer between 2 and 100 and then deleting every composite even 
number; that is, one deletes (or sieves out) every second integer up to x after 2. 


⁴In 1640, France was home to the great philosophers and mathematicians of the age, such as
Descartes, Desargues, Fermat, and Pascal. From 1630 on, Father Marin Mersenne wrote letters to all of
these luminaries, posting challenges and persuading them all to think about perfect numbers.

⁵How slow is “slow”? If we could test divide by one prime per second, for a year, with no rest,
then we could determine the primality of 17-digit numbers but not 18-digit numbers. If we used the
world’s fastest computer in 2019, we could test divide 53-, but not 54-, digit numbers. In chapter 10 we
will encounter much better methods that can test such a number for primality, in moments.

⁶Eratosthenes lived in Cyrene in ancient Greece, from 276 to 195 B.C. He created the grid system
of latitude and longitude to draw an accurate map of the world incorporating parallels and meridians.
He was the first to calculate the circumference of the earth, the tilt of the earth’s axis, and the distance
from the earth to the sun (and so invented the leap day). He even attempted to assign dates to what
was then ancient history (like the conquest of Troy) using available evidence.




     2   3   5   7   9
    11  13  15  17  19
    21  23  25  27  29
    31  33  35  37  39
    41  43  45  47  49
    51  53  55  57  59
    61  63  65  67  69
    71  73  75  77  79
    81  83  85  87  89
    91  93  95  97  99


Deleting every even number > 2, between 2 and 100


The first undeleted integer > 2 is 3; one then deletes every composite integer 
divisible by 3; that is, one sieves out every third integer up to x after 3. The next 
undeleted integer is 5 and one sieves out every fifth integer after 5, and then every 
seventh integer after 7. 


     2   3   5   7
    11  13      17  19
        23  25      29
    31      35  37
    41  43      47  49
        53  55      59
    61      65  67
    71  73      77  79
        83  85      89
    91      95  97

Then delete remaining integers > 3 that are divisible by 3


     2   3   5   7
    11  13      17  19
        23          29
    31          37
    41  43      47
        53          59
    61          67
    71  73          79
        83          89
                97

and > 5 that are divisible by 5 and > 7 that are divisible by 7.


The sieve of Eratosthenes enables us to find all of the primes up to 100.


What’s left are the primes up to 100. To obtain the primes up to any given limit
$x$, one keeps on going like this, finding the next undeleted integer, call it $p$, which
must be prime since it is undeleted, and then deleting every $p$th integer beyond $p$
and up to $x$. We stop once $p > \sqrt{x}$ and then the undeleted integers are the primes
$\leq x$. There are about $x \log\log x$ steps⁷ in this algorithm, so it is a remarkably
efficient way to find all the primes up to some given $x$,⁸ but not for finding any
particular prime.
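The procedure just described can be sketched in Python as follows. We follow the text literally, deleting every $p$th integer beyond $p$ for each undeleted $p \leq \sqrt{x}$. The function name primes_up_to is our own choice, and many standard refinements are deliberately omitted.

```python
from math import isqrt


def primes_up_to(x: int) -> list[int]:
    """Sieve of Eratosthenes: return all primes <= x."""
    if x < 2:
        return []
    undeleted = [True] * (x + 1)
    undeleted[0] = undeleted[1] = False
    for p in range(2, isqrt(x) + 1):
        if undeleted[p]:  # p survived all earlier deletions, so p is prime
            for multiple in range(2 * p, x + 1, p):  # every pth integer beyond p
                undeleted[multiple] = False
    return [n for n in range(2, x + 1) if undeleted[n]]


primes = primes_up_to(100)
print(primes)
print(len(primes))
```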


Exercise 5.2.1. Use this method to find all of the primes up to 200. 


⁷How should one think about an expression like $\log\log x$? It goes to $\infty$ as $x$ does, but it is a
very slow growing function of $x$. For example, if $x = 10^{100}$, far more than the current estimate for the
number of atoms in the universe, then $\log\log x < 5.5$. Dan Shanks once wrote that “$\log\log x$ goes to
infinity with great dignity.”

⁸In practice, this algorithm determines which of the first $x$ integers are prime in no more than $6x$
steps.


The number of integers left after one removes the multiples of 2 is roughly $\frac{1}{2} \cdot x$,
since about half of the integers up to $x$ are divisible by 2. After one then removes
the multiples of 3, one expects that there are about $\frac{2}{3} \cdot \frac{1}{2} \cdot x$ integers left, since
about a third of the odd integers up to $x$ are divisible by 3. In general removing
the multiples of $p$ removes, we expect, about $1/p$ of the integers in our set and so
leaves a proportion $1 - \frac{1}{p}$. Therefore we expect that the number of integers left
unsieved in the sieve of Eratosthenes, up to $x$, after sieving by the primes up to $y$,
is about

$$x \prod_{\substack{p \leq y \\ p \text{ prime}}} \left(1 - \frac{1}{p}\right).$$

The product $\prod_{p \leq y} \left(1 - \frac{1}{p}\right)$ is well-approximated by $e^{-\gamma}/\log y$, where $\gamma$ is the Euler-
Mascheroni constant discussed in section 4.14 of appendix 4D.⁹ The logarithm, used
here and elsewhere in this book, is the natural logarithm.

When we take $y = \sqrt{x}$, then only 1 and the primes up to $x$ should be left in the
sieve of Eratosthenes, and so one might guess from this analysis of sieve methods
that the number of primes up to $x$ is approximately

$$(5.2.1) \qquad 2e^{-\gamma} \frac{x}{\log x}.$$

This guess is not correct; the constant is off,¹⁰ as we will discuss in section 5.4.
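Both claims are easy to explore numerically. The sketch below compares the product $\prod_{p \leq y}(1 - 1/p)$ with $e^{-\gamma}/\log y$, and compares the guess (5.2.1) with the true count of primes. The numerical value used for $\gamma$ is the standard decimal approximation, the function names are ours, and nothing here is more than an experiment.

```python
from math import exp, isqrt, log

EULER_GAMMA = 0.5772156649015329  # decimal approximation of the Euler-Mascheroni constant


def primes_up_to(x: int) -> list[int]:
    """Sieve of Eratosthenes: return all primes <= x."""
    if x < 2:
        return []
    undeleted = [True] * (x + 1)
    undeleted[0] = undeleted[1] = False
    for p in range(2, isqrt(x) + 1):
        if undeleted[p]:
            for multiple in range(2 * p, x + 1, p):
                undeleted[multiple] = False
    return [n for n in range(2, x + 1) if undeleted[n]]


def sieve_product(y: int) -> float:
    """The product of (1 - 1/p) over the primes p <= y."""
    product = 1.0
    for p in primes_up_to(y):
        product *= 1 - 1 / p
    return product


def mertens_estimate(y: int) -> float:
    """The approximation e^(-gamma) / log y to sieve_product(y)."""
    return exp(-EULER_GAMMA) / log(y)


def heuristic_prime_count(x: int) -> float:
    """The (incorrect) guess (5.2.1): 2 e^(-gamma) x / log x."""
    return 2 * exp(-EULER_GAMMA) * x / log(x)


y = 10 ** 4
print(sieve_product(y), mertens_estimate(y))

x = 10 ** 5
print(len(primes_up_to(x)), heuristic_prime_count(x), x / log(x))
```

In this experiment the guess (5.2.1) overshoots the true count, while $x/\log x$ undershoots it.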


5.3. Primes in certain arithmetic progressions 


How are the primes split between the arithmetic progressions modulo 3? Or modulo 
4? Or modulo any given integer m? Evidently every integer in the arithmetic 
progression 0 (mod 3) (that is, integers of the form 3k) is divisible by 3, so the 
only prime in that arithmetic progression is 3 itself. There are no such divisibility 
restrictions for the arithmetic progressions 1 (mod 3) and 2 (mod 3) and if we 
partition the primes up to 100 into these arithmetic progressions, we find: 


Primes $\equiv 1 \pmod 3$: 7, 13, 19, 31, 37, 43, 61, 67, 73, 79, 97, ....
Primes $\equiv 2 \pmod 3$: 2, 5, 11, 17, 23, 29, 41, 47, 53, 59, 71, 83, 89, ....


There seem to be a lot of primes in each arithmetic progression, and they seem to 
be roughly equally split between the two. Let’s see what we can prove. First let’s 
deal, in general, with the analogy to the case 0 (mod 3). This includes not only 0 
(mod m) but also cases like 2 (mod 4): 


Exercise 5.3.1. (a) Prove that any integer $\equiv a \pmod m$ is divisible by $(a,m)$.
(b) Deduce that if $(a,m) > 1$ and if there is a prime $\equiv a \pmod m$, then that prime is $(a,m)$.
(c) Give examples of arithmetic progressions which contain exactly one prime and examples 
which contain none. 
(d) Show that the arithmetic progression 2 (mod 6) contains infinitely many prime powers. 


⁹This is a fact that is beyond the scope of this book but will be discussed in [Gra1]. In fact
$e^{-\gamma} = .56145948\ldots$.

¹⁰Though not by much. The correct constant is 1 whereas $2e^{-\gamma} = 1.12291896\ldots$.


Therefore all but finitely many primes are distributed among the $\phi(m)$ arithmetic
progressions $a$ (mod $m$) with $(a,m) = 1$. How are they distributed? If the
$m = 3$ case is anything to go by, it appears that there are infinitely many in each
such arithmetic progression, and maybe even roughly equal numbers of primes in
each up to any given point.


We will prove that there are infinitely many primes in each of the two feasible
residue classes mod 3 (see Theorems 5.2 and 7.7).


Theorem 5.2. There are infinitely many primes $\equiv -1 \pmod 3$.


Proof. Suppose that there are only finitely many primes $\equiv -1 \pmod 3$, say
$p_1, p_2, \ldots, p_k$. The integer $N = 3 p_1 p_2 \cdots p_k - 1$ must have a prime factor $q \equiv -1
\pmod 3$, by exercise 5.3.2. However $q$ divides both $N$ and $N + 1$ (since it must be
one of the primes $p_i$), and hence $q$ divides their difference, 1, which is impossible.


Exercise 5.3.2. Use exercise 3.1.4(a) to show that if $n \equiv -1 \pmod 3$, then there exists a prime
factor $p$ of $n$ which is $\equiv -1 \pmod 3$.


In 1837 Dirichlet showed that whenever $(a,q) = 1$ there are infinitely many
primes $\equiv a \pmod q$. (We discuss this deep result in sections of appendix
8D and [73.7]) In fact there are roughly equally many primes in each of these
arithmetic progressions mod $q$. For example, half the primes are $\equiv 1 \pmod 3$ and
half are $\equiv -1 \pmod 3$, as our data above suggested. Roughly 1% of the primes are
$\equiv 69 \pmod{101}$ and indeed there are roughly 1% of the primes in each arithmetic
progression $a$ mod 101 with $1 \leq a \leq 100$. This is a deep result and will be discussed
at length in our book [Gra1].
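These distribution statements invite experiment. The Python sketch below counts the primes up to $x$ in each residue class mod $q$, using a sieve to list the primes. The function names are our own, and the counts for any particular $x$ only illustrate Dirichlet's theorem; they prove nothing about the limit.

```python
from collections import Counter
from math import isqrt


def primes_up_to(x: int) -> list[int]:
    """Sieve of Eratosthenes: return all primes <= x."""
    if x < 2:
        return []
    undeleted = [True] * (x + 1)
    undeleted[0] = undeleted[1] = False
    for p in range(2, isqrt(x) + 1):
        if undeleted[p]:
            for multiple in range(2 * p, x + 1, p):
                undeleted[multiple] = False
    return [n for n in range(2, x + 1) if undeleted[n]]


def primes_by_residue(x: int, q: int) -> Counter:
    """Count the primes p <= x according to the value of p mod q."""
    return Counter(p % q for p in primes_up_to(x))


print(sorted(primes_by_residue(100, 3).items()))

counts = primes_by_residue(10 ** 6, 101)
print(min(counts[a] for a in range(1, 101)), max(counts[a] for a in range(1, 101)))
```

For $x = 100$ and $q = 3$ the counts reproduce the two lists of primes displayed earlier in this section, together with the single prime 3 in the class 0 (mod 3).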


Exercise 5.3.3. Prove that there are infinitely many primes $\equiv -1 \pmod 4$.


Exercise 5.3.4. Prove that there are infinitely many primes $\equiv 5 \pmod 6$.


Exercise 5.3.5.† Prove that at least two of the arithmetic progressions mod 8 contain infinitely
many primes.


Exercise 5.8.6 generalizes these results considerably, using similar ideas.


5.4. How many primes are there up to x? 


When people started to develop large tables of primes, perhaps looking for a pattern,
they discovered no patterns but did find that the proportion of integers that are
prime is gradually diminishing (which will be proved in appendix 5B). In 1808
Legendre quantified this, suggesting that there are about x/log x primes
up to x.¹¹ A few years earlier, aged 15 or 16, Gauss had already made a much better
guess, based on studying tables of primes:


In 1792 or 1793 ... I turned my attention to the decreasing frequency 
of primes ... counting the primes in intervals of length 1000. I soon 
recognized that behind all of the fluctuations, this frequency is on average 
inversely proportional to the logarithm .... 

— from a letter to ENCKE by K. F. Gauss (Christmas Eve, 1849) 


11 And even the more precise assertion that there exists a constant B such that π(x), the number
of primes up to x, is well-approximated by x/(log x − B) for large enough x. This turns out to be true
with B = 1, though this was not the value for B suggested by Legendre (who presumably made a guess
based on data for small values of x).


His observation may be best phrased as

About 1 in log x of the integers near x are prime,


which is (subtly) different from Legendre’s assertion: Gauss’s observation suggests
that a good approximation to the number of primes up to x is $\sum_{n=2}^{x} \frac{1}{\log n}$.
Now, as $\frac{1}{\log t}$ does not vary much for t between n and n + 1, Gauss deduced
that π(x) should be well-approximated by


(5.4.1)    $\int_2^x \frac{dt}{\log t}$.


We denote this quantity by Li(x) and call it the logarithmic integral.¹² The logarithm
here is again the natural logarithm. Here is a comparison of Gauss’s prediction
with the actual count of primes up to various values of x:


x        π(x) = #{primes ≤ x}            Overcount: Li(x) − π(x)
10^3     168                             10
10^4     1229                            17
10^5     9592                            38
10^6     78498                           130
10^7     664579                          339
10^8     5761455                         754
10^9     50847534                        1701
10^10    455052511                       3104
10^11    4118054813                      11588
10^12    37607912018                     38263
10^13    346065536839                    108971
10^14    3204941750802                   314890
10^15    29844570422669                  1052619
10^16    279238341033925                 3214632
10^17    2623557157654233                7956589
10^18    24739954287740860               21949555
10^19    234057667276344607              99877775
10^20    2220819602560918840             222744644
10^21    21127269486018731928            597394254
10^22    201467286689315906290           1932355208
10^23    1925320391606803968923          7250186216
10^24    18435599767349200867866         17146907278
10^25    176846309399143769411680        55160980939


Primes up to various x and the overcount in Gauss’s prediction. 


Gauss’s prediction is amazingly accurate. From the data, Gauss’s prediction seems
to overcount by a small amount, for all x ≥ 8.¹³ To quantify this “small amount”,
we observe that the last column (representing the overcount) is always about half
the width of the central column (representing the number of primes up to x), so
these data suggest that the difference is no bigger than a small multiple of √x.
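
The first few rows of such a comparison can be recomputed directly. The sketch below is an illustration under stated assumptions, not the method used to produce the table: `prime_count_table` is a plain sieve of Eratosthenes, and `Li` approximates the integral of dt/log t from 2 to x by Simpson's rule after the substitution t = e^u (both function names and the step count are choices made here). The differences it prints should be comparable to the overcount column, though the last digit can depend on rounding and on where the integral is started.

```python
from math import log, exp

def prime_count_table(limit):
    """Sieve of Eratosthenes; returns a list pi with pi[n] = number of primes <= n."""
    is_prime = bytearray([1]) * (limit + 1)
    is_prime[0:2] = b"\x00\x00"
    for p in range(2, int(limit ** 0.5) + 1):
        if is_prime[p]:
            is_prime[p * p :: p] = bytearray(len(range(p * p, limit + 1, p)))
    pi, count = [0] * (limit + 1), 0
    for n in range(limit + 1):
        count += is_prime[n]
        pi[n] = count
    return pi

def Li(x, steps=20000):
    """Approximate the integral of dt/log t over [2, x] by Simpson's rule.

    Substituting t = e^u turns the integrand into e^u / u on [log 2, log x].
    The number of steps must be even.
    """
    a, b = log(2), log(x)
    h = (b - a) / steps
    f = lambda u: exp(u) / u
    total = f(a) + f(b)
    total += 4 * sum(f(a + i * h) for i in range(1, steps, 2))
    total += 2 * sum(f(a + i * h) for i in range(2, steps, 2))
    return total * h / 3

pi = prime_count_table(10 ** 6)
for k in range(3, 7):
    x = 10 ** k
    print(x, pi[x], round(Li(x) - pi[x], 2), round(x ** 0.5, 1))
```

The last printed column is √x, so one can see at a glance that the overcount stays well below it in this range.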


12 Some authors begin the integral defining Li(x) at x = 0. This adds complication since the
integrand equals ∞ at x = 1; nonetheless this can be handled, and the difference between the two
definitions is then the constant 1.045163..., which has little relevance to our discussion.
13 It is not true that Li(x) > π(x) for all x > 2 but the first counterexample is far beyond where
we can hope to calculate. Understanding how we know this is well beyond the scope of this book, but
see [Gra1].


This might be optimistic but, at the very least, the ratio of π(x), the number of
primes up to x, to Gauss’s guess, Li(x), should tend to 1 as x → ∞; that is,


π(x)/Li(x) → 1 as x → ∞.


In exercise 5.8.11 we show that $\mathrm{Li}(x) \big/ \frac{x}{\log x} \to 1$ as x → ∞, and combining these
last two limits, we deduce that


$\pi(x) \big/ \frac{x}{\log x} \to 1$ as x → ∞.


The notation of limits is cumbersome; it is easier to write


(5.4.2)    $\pi(x) \sim \frac{x}{\log x}$


as x → ∞, “π(x) is asymptotic to x/log x”.¹⁴ This is different by a multiplicative
constant from (5.2.1), our guesstimate based on the sieve of Eratosthenes. Our
data makes it seem more likely that the constant 1 given here, rather than the
$2e^{-\gamma}$ given in (5.2.1), is the correct constant.


The asymptotic (5.4.2) is called the prime number theorem. Its proof came 
in 1896, more than 100 years after Gauss’s guess, involving several remarkable 
developments. It was a high point of 19th-century mathematics and there is still no 
straightforward approach. The main reason is that the prime number theorem can 
be shown to be equivalent to a statement about zeros of the analytic continuation 
of a function (the Riemann zeta-function which we discuss in appendices 4B, 5B, 
and 5D), which seems preposterous at first sight. Although proofs can be given 
that avoid mentioning these zeros, they are still lurking somewhere just beneath 
the surface.¹⁵ A proof of the prime number theorem is beyond the scope of this
book (but see and [GS]). 


Exercise 5.4.1.† Assume the prime number theorem.
(a) Show that there are infinitely many primes whose leading digit is a “1”. How about leading
digit “7”?
(b) Show that for all ε > 0, if x is sufficiently large, then there are primes between x and x + εx.
(c) Deduce that $\mathbb{R}_{\ge 0}$ is the set of limit points of the set {p/q : p, q primes}.
(d) Let a_1, ..., a_d be any sequence of digits, that is, integers between 0 and 9, with a_1 ≠ 0.
Show that there are infinitely many primes whose first (leading) d digits are a_1, ..., a_d.


Exercise 5.4.2.† Let p_1 = 2 < p_2 = 3 < ⋯ be the sequence of primes. Assume the prime
number theorem and prove that


p_n ∼ n log n as n → ∞.


Exercise 5.4.3.† (a) Show that the sum of primes and prime powers ≤ x is ∼ x²/(2 log x).
(b) Deduce that if the sum equals N, then x ∼ $\sqrt{N \log N}$.


14 In general, A(x) ∼ B(x), that is, A(x) is asymptotic to B(x), is equivalent to
$\lim_{x\to\infty} A(x)/B(x) = 1$. It does not mean that “A(x) is approximately equal to B(x)”, which has
no strict mathematical meaning, rather that for any ε > 0, no matter how small, one has


(1 − ε)B(x) < A(x) < (1 + ε)B(x)


once x is sufficiently large (where how large is “sufficiently large” depends on ε). This definition concerns
the ratio A(x)/B(x), not their difference A(x) − B(x). Therefore n² + 1 ∼ n² and n² + n²/log n ∼ n²
are equally true, even though the first is a better approximation to n² than the second ([Sha85], p. 16).
15 Including the so-called “elementary proof” of the prime number theorem.




Primes in arithmetic progressions. As we mentioned in section 5.3, Dirichlet
showed in 1837 that if (a,q) = 1, then there are infinitely many primes p ≡ a
(mod q). Dirichlet’s proof was combined in 1896 with the proof of the prime number
theorem to establish that


$\#\{\text{primes } p \le x : p \equiv a \pmod q\} \sim \frac{x}{\phi(q) \log x}$.


The factor “1/φ(q)” emerges as there are φ(q) reduced residues a modulo q.
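
The equidistribution of primes among the reduced residue classes is easy to observe numerically. The following sketch is only an illustration (the function names `primes_up_to` and `counts_by_residue` are hypothetical): it sieves the primes up to a limit and counts how many fall into each class a (mod q) with (a,q) = 1.

```python
from math import gcd

def primes_up_to(limit):
    """Return the list of primes <= limit, using the sieve of Eratosthenes."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, int(limit ** 0.5) + 1):
        if sieve[p]:
            sieve[p * p :: p] = bytearray(len(range(p * p, limit + 1, p)))
    return [n for n in range(limit + 1) if sieve[n]]

def counts_by_residue(limit, q):
    """Count the primes <= limit in each reduced residue class modulo q."""
    counts = {a: 0 for a in range(q) if gcd(a, q) == 1}
    for p in primes_up_to(limit):
        if p % q in counts:          # primes dividing q are skipped
            counts[p % q] += 1
    return counts

for q in (3, 4, 10):
    print(q, counts_by_residue(10 ** 5, q))
```

Each class should receive close to 9592/φ(q) of the 9592 primes up to 10^5, apart from the finitely many primes dividing q.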


Exercise 5.4.4.† Use the prime number theorem in arithmetic progressions to prove that for any
integers a_1, ..., a_d, b_0, ..., b_d ∈ {0, ..., 9}, with a_1 ≠ 0 and b_0 = 1, 3, 7, or 9, there are infinitely
many primes whose first d digits are a_1, ..., a_d and whose last d digits are b_d, ..., b_0.


5.5. Bounds on the number of primes 


The first quantitative lower bound proven on the number of primes is due to Euler
in the mid-18th century who showed that


$\sum_{p \text{ prime}} \frac{1}{p}$ diverges,


as we will prove in appendix 5B. This gives some idea of how
numerous the primes are in comparison to other sequences of integers. For
example $\sum_{n \ge 1} \frac{1}{n^2}$ converges, so the primes are, in this sense, more numerous than
the squares. This implies that there are arbitrarily large values of x for which


π(x) > √x.


Exercise 5.5.1.† Do better than this using Euler’s result.
(a) Prove that $\sum_{n > 1} \frac{1}{n (\log n)^2}$ converges.
(b) Deduce that there are arbitrarily large x for which π(x) > x/(log x)².


Next we will prove upper and lower bounds for the number of primes up to x,
of the form


(5.5.1)    $c_1 \frac{x}{\log x} \le \pi(x) \le c_2 \frac{x}{\log x}$


for some constants 0 < c_1 < 1 < c_2, for all sufficiently large x. The prime number
theorem is equivalent to being able to take c_1 = 1 − ε and c_2 = 1 + ε for any fixed
ε > 0 in (5.5.1). Instead we will prove Chebyshev’s weaker 1850 result that one can
take any c_1 < log 2 and any c_2 > log 4 in (5.5.1).


Theorem 5.3. For all integers n ≥ 2 we have


$(\log 2) \frac{n}{\log n} - 1 \le \pi(n) \le (\log 4) \frac{n}{\log n} + 4 \frac{n}{(\log n)^2}$.


Exercise 5.5.2. Fix ε > 0 arbitrarily small. Deduce Chebyshev’s bounds (5.5.1) with c_1 =
log 2 − ε and c_2 = log 4 + ε, for all sufficiently large x, from Theorem 5.3.
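
Bounds of Chebyshev's type are easy to test by machine. The sketch below checks, for every 2 ≤ n ≤ 10^5, the two inequalities (log 2) n/log n − 1 ≤ π(n) ≤ (log 4) n/log n + 4n/(log n)². It is a numerical sanity check only, not a proof; the function name and the small floating-point tolerance are choices made here.

```python
from math import log

def chebyshev_violations(limit, tol=1e-9):
    """Return all n in [2, limit] at which either Chebyshev-type bound fails."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, int(limit ** 0.5) + 1):
        if sieve[p]:
            sieve[p * p :: p] = bytearray(len(range(p * p, limit + 1, p)))
    violations, count = [], 0
    for n in range(2, limit + 1):
        count += sieve[n]                      # count is now pi(n)
        L = log(n)
        lower = log(2) * n / L - 1
        upper = log(4) * n / L + 4 * n / L ** 2
        if not (lower - tol <= count <= upper + tol):
            violations.append(n)
    return violations

print(chebyshev_violations(10 ** 5))
```

The lower bound is attained with equality at n = 2, which is why a tolerance is used in the comparison.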


Proof. The binomial theorem states that $(x + y)^N = \sum_{j=0}^{N} \binom{N}{j} x^j y^{N-j}$. Taking
x = y = 1 we get


(5.5.2)    $\sum_{j=0}^{N} \binom{N}{j} = 2^N$.


Lemma 5.5.1. The product of the primes up to N is ≤ $4^{N-1}$ for all N ≥ 1.


Proof. Each prime in [n + 1, 2n] appears in the numerator of the binomial coefficient
$\binom{2n-1}{n}$ but not in the denominator, and so their product divides $\binom{2n-1}{n}$. Now if
N = 2n − 1 is odd, then $\binom{2n-1}{n-1} = \binom{2n-1}{n}$ so the value appears twice in the sum in
(5.5.2). Therefore


(5.5.3)    $\prod_{\substack{p \text{ prime} \\ n < p \le 2n}} p \le \binom{2n-1}{n} \le \frac{1}{2} \sum_{j=0}^{2n-1} \binom{2n-1}{j} = 2^{2n-2} = 4^{n-1}$.


We now prove the claimed result by induction on N ≥ 1. The result is
straightforward for N = 1, 2 by calculation. If N = 2n or 2n − 1, then the product of the
primes up to N is at most the product of the primes up to n times the product of
the primes in [n + 1, 2n]. The first product is ≤ $4^{n-1}$ by the induction hypothesis,
and the second ≤ $4^{n-1}$ by the previous paragraph. Combining these two upper
bounds gives the upper bound ≤ $4^{2n-2}$ ≤ $4^{N-1}$, as claimed.
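
The bound of Lemma 5.5.1 can be confirmed for small N with exact integer arithmetic. This is a minimal sketch (the function name is hypothetical, and primality is tested by naive trial division), offered as a check rather than as part of the proof.

```python
def primorial_bound_holds(limit):
    """Check that the product of the primes up to N is <= 4^(N-1) for 1 <= N <= limit."""
    product = 1
    for N in range(1, limit + 1):
        # N is prime if it has no divisor d with 2 <= d <= sqrt(N).
        if N >= 2 and all(N % d for d in range(2, int(N ** 0.5) + 1)):
            product *= N
        if product > 4 ** (N - 1):
            return False
    return True

print(primorial_bound_holds(2000))
```

Python integers have arbitrary precision, so both sides are compared exactly even though they have over a thousand digits at the top of the range.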


If we take logarithms in (5.5.3), we obtain


(5.5.4)    $\sum_{\substack{p \text{ prime} \\ n < p \le 2n}} \log p \le (n - 1) \log 4$.


As each term of the left side is > log n we deduce that

(5.5.5)    $\pi(2n) - \pi(n) = \#\{p \text{ prime} : n < p \le 2n\} \le \frac{n-1}{\log n} \cdot \log 4$.


We now use this to deduce the upper bound claimed in Theorem 5.3. We verify the
bound by calculations for all N ≤ 100 and then proceed by induction for N ≥ 101.
If N = 2n or 2n − 1 (so that n ≥ 51), then by the induction hypothesis and (5.5.5)


$\pi(N) \le \pi(2n) = \pi(n) + (\pi(2n) - \pi(n)) \le (\log 4) \frac{2n-1}{\log n} + 4 \frac{n}{(\log n)^2}$


and for all n ≥ 51 this is


$< (\log 4) \frac{2n-1}{\log 2n} + 4 \frac{2n-1}{(\log 2n)^2} \le (\log 4) \frac{N}{\log N} + 4 \frac{N}{(\log N)^2}$


as a careful calculation reveals. This yields the upper bound claimed in Theorem
5.3.

To obtain the lower bound claimed in Theorem 5.3 we begin by observing that
the largest binomial coefficient $\binom{n}{m}$ occurs with m = [n/2]. All the other binomial
coefficients are smaller, as is $\binom{n}{0} + \binom{n}{n}$, so that


$2^n = \left( \binom{n}{0} + \binom{n}{n} \right) + \sum_{j=1}^{n-1} \binom{n}{j} \le n \binom{n}{[n/2]}$


by (5.5.2). Now, all prime factors of $\binom{n}{[n/2]}$ are ≤ n, and in fact if $p^{e_p}$ divides
$\binom{n}{[n/2]}$ then $p^{e_p}$ ≤ n by Corollary 3.10.1 to Kummer’s Theorem. Therefore


$2^n \le n \binom{n}{[n/2]} \le n \prod_{p \le n} p^{e_p} \le n^{\pi(n)+1}$.


Taking logarithms we deduce the claimed result.


Exercise 5.5.3. Use exercise 3.10.3 and the last displayed equation to prove that


(5.5.6)    $\mathrm{lcm}[m : m \le n] \ge \frac{2^n}{n}$.


5.6. Gaps between primes 


Let p_1 = 2 < p_2 = 3 < ⋯ be the sequence of primes. We are interested in the
possible gaps, p_{n+1} − p_n, between primes.


The prime number theorem tells us that there are about x/log x primes up to
x, so that the average gap between primes ≤ x is about log x: If N = π(x), then
p_N is the largest prime ≤ x, and p_N ∼ x by exercise 5.4.1(b). This implies that the
average gap between consecutive primes up to x is


$\frac{1}{N-1} \sum_{n=1}^{N-1} (p_{n+1} - p_n) = \frac{p_N - 2}{N-1} \sim \frac{x}{x/\log x} = \log x$


by the prime number theorem.
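
The telescoping identity for the average gap is simple to evaluate. The sketch below (with hypothetical function names) computes (p_N − 2)/(N − 1) for the primes up to x and compares it with log x; the ratio tends to 1 only slowly, so it is still visibly below 1 for the values of x that a quick sieve can reach.

```python
from math import log

def primes_up_to(limit):
    """Return the list of primes <= limit, using the sieve of Eratosthenes."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, int(limit ** 0.5) + 1):
        if sieve[p]:
            sieve[p * p :: p] = bytearray(len(range(p * p, limit + 1, p)))
    return [n for n in range(limit + 1) if sieve[n]]

def average_gap(x):
    """Average of p_{n+1} - p_n over consecutive primes up to x (telescoped)."""
    primes = primes_up_to(x)
    return (primes[-1] - 2) / (len(primes) - 1)

for k in range(3, 7):
    x = 10 ** k
    print(x, round(average_gap(x), 3), round(log(x), 3))
```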


Are there gaps between consecutive primes that are much smaller than the 
average? Much larger than the average? What is the largest that gaps between 
primes can be, and what is the smallest? 


Exercise 5.6.1. (a) Prove that there are gaps between primes ≤ x that are at least as large
as the average gap between primes up to x.
(b) Prove that there are gaps between primes ≤ x that are no bigger than the average gap
between primes up to x.


Legendre conjectured that there are always primes between consecutive squares,
that is, that there are primes in the interval (n², (n + 1)²) for every integer n.


Exercise 5.6.2. (a) Show that if every interval (x, x + 2√x) contains a prime, then there are
always primes between consecutive squares.
(b) Show that if there are always primes between consecutive squares, then every interval
(x, x + 4√x + 3] contains a prime.


At present we do not know how to prove every interval (x, x + C√x) contains
primes, for any given C > 0. However it has been proven, by Baker, Harman, and
Pintz, that any interval $(x, x + c x^{21/40}]$ contains a prime for c sufficiently large.


Exercise 5.6.3. Deduce from this that there is a prime between any consecutive, sufficiently
large, cubes.


There is a simple way to construct a long interval which contains no primes:


Proposition 5.6.1. For any integer m the interval m! + 2, m! + 3, ..., m! + m
contains no primes. Therefore if p_n is the largest prime ≤ m! + 1, then p_{n+1} − p_n ≥
m.


Proof. If 2 ≤ j ≤ m, then j is included in the product for m!, and so j divides
m! + j. Therefore m! + j is composite as it is > j. Now p_{n+1} ≠ m! + j for each
such j and so p_{n+1} ≥ m! + m + 1 ≥ p_n + m.
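
Proposition 5.6.1 is concrete enough to check by hand or by machine for small m. The following sketch (function names are hypothetical; primality is tested by trial division, so m should stay small) verifies that j divides m! + j for 2 ≤ j ≤ m and then locates the actual primes on either side of the prime-free interval.

```python
from math import factorial

def is_prime(n):
    """Trial-division primality test, adequate for the small inputs used here."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

def factorial_interval_is_prime_free(m):
    """Check that j divides m! + j, and m! + j > j, for every 2 <= j <= m."""
    f = factorial(m)
    return all((f + j) % j == 0 and f + j > j for j in range(2, m + 1))

def surrounding_primes(m):
    """Return (largest prime <= m! + 1, smallest prime > m! + 1)."""
    below = factorial(m) + 1
    while not is_prime(below):
        below -= 1
    above = factorial(m) + 2
    while not is_prime(above):
        above += 1
    return below, above

for m in range(2, 9):
    p, q = surrounding_primes(m)
    print(m, p, q, q - p)
```

The observed gap q − p is often larger than m, since the proposition only gives a lower bound.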


The gaps between primes constructed in this way are not quite as large as the
average gaps. However one can extend this idea, creating a long interval of integers
which each have a small prime factor, to prove that


$\limsup_{n \to \infty} \frac{p_{n+1} - p_n}{\log p_n} = \infty$.


Proving this is again beyond the scope of this book but a proof can be found in
[Gra1].


What about small gaps between primes? 


Exercise 5.6.4. Prove that 2 and 3 are the only two primes that differ by 1. 


There are plenty of pairs of primes that differ by two, namely 3 and 5, 5 and 7,
11 and 13, 17 and 19, etc., seemingly infinitely many, and this twin prime conjecture
that there are infinitely many prime twins p, p + 2 remains an open problem. Until
recently, very little was proved about short gaps between primes, but that changed
in 2009, when Goldston, Pintz, and Yıldırım (see [1]) showed that


$\liminf_{n \to \infty} \frac{p_{n+1} - p_n}{\log p_n} = 0$.


In 2013, Yitang Zhang, until then a practically unknown mathematician,¹⁶ showed
that there are infinitely many pairs of primes that differ by at most a bounded
amount. More precisely there exists a constant B such that there are infinitely
many pairs of distinct primes that differ by at most B. This was soon improved by
Maynard and Tao, though by a different method, so that we now know there are
infinitely many pairs of consecutive primes p_n, p_{n+1} such that


p_{n+1} − p_n ≤ 246.


This is not quite the twin prime conjecture, but it is a very exciting development.
(See [2] for a discussion.)


The proofs of Maynard and of Tao yield a further great result: For any
integer m ≥ 3 there are infinitely many intervals of length $2^{14m}$ which contain
m primes. That is, there are infinitely many m-tuples of consecutive primes
p_n, p_{n+1}, ..., p_{n+m−1} such that


$p_{n+m-1} - p_n \le 2^{14m}$.

16 See the movie Counting from Infinity (Zala Films, 2015) for an account of his fascinating story.


Further reading on hot topics in this section
[1] K. Soundararajan, Small gaps between prime numbers: The work of Goldston-Pintz-Yıldırım, Bull.
Amer. Math. Soc. (N.S.) 44 (2007), 1-18.


[2] Andrew Granville, Primes in intervals of bounded length, Bull. Amer. Math. Soc. (N.S.) 52 (2015),
171-222.


5.7. Formulas for primes 


Are there polynomials (of degree ≥ 1) that only yield prime values? That is, is
f(n) prime for every integer n? The example 6n + 5 begins by taking the prime
values 5, 11, 17, 23, 29 before getting to 35 = 5 × 7. Continuing on, we get more
primes 41, 47, 53, 59 till we hit 65 = 5 × 13, another multiple of 5. So every fifth
term of the arithmetic progression seems to be divisible by 5, which we verify as
6(5k) + 5 = 5(6k + 1). More generally qn + a is a multiple of a whenever n is
a multiple of a, since q(ak) + a = a(qk + 1). A famous example of a polynomial
that takes lots of prime values is f(x) = x² + x + 41. Indeed f(n) is prime for
0 ≤ n ≤ 39. However f(40) = 41² and f(41k) = 41(41k² + k + 1). Therefore f(41k)
is composite for each integer k for which 41k² + k + 1 ≠ −1, 0, or 1.
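
These claims about x² + x + 41 can be checked in a few lines. The sketch below is an illustration with hypothetical helper names; it tests primality by trial division, which is ample for values of this size.

```python
def is_prime(n):
    """Trial-division primality test."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

def f(n):
    """Euler's polynomial n^2 + n + 41."""
    return n * n + n + 41

prime_run = [n for n in range(0, 60) if is_prime(f(n))]
print(prime_run)
print(f(40), 41 ** 2)
print([f(41 * k) // 41 for k in range(1, 6)])   # the cofactors 41k^2 + k + 1
```

The first list contains every n from 0 to 39, and the last line shows the cofactor 41k² + k + 1 of 41 in f(41k).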

We will develop this argument to work for all polynomials, but we will need the
following result, which is a consequence of the Fundamental Theorem of Algebra
and is proved in Theorem 3.11 of section 3.22 in appendix 3F.


Lemma 5.7.1. A non-zero degree d polynomial has no more than d distinct roots 
in C. 


The main consequence that we need is the following: 


Corollary 5.7.1. Suppose that f(x) ∈ Z[x] has degree d ≥ 1. For any integer
B ≥ 1, there are no more than (2B + 1)d integers n for which |f(n)| ≤ B.


Proof. If n is an integer, then so is f(n), and therefore if |f(n)| ≤ B, then f(n) = m
for some integer m with |m| ≤ B. Therefore n is a root of one of the 2B + 1
polynomials f(x) − m, each of which has no more than d roots by Lemma 5.7.1,
and so the result follows.


Proposition 5.7.1. If f(x) ∈ Z[x] has degree d ≥ 1, then there are infinitely many
integers n for which |f(n)| is composite.


Proof. By Corollary 5.7.1 there are no more than 3d integers n for which f(n) =
−1, 0, or 1, so there exists an integer a in the range 0 ≤ a ≤ 3d for which |f(a)| > 1.
Let m := |f(a)| > 1. Now km + a ≡ a (mod m) and so, by Corollary 2.3.1, we
have

f(km + a) ≡ f(a) ≡ 0 (mod m).

There are at most 3d values of k for which km + a is a root of one of f(x) − m,
f(x), or f(x) + m, by Corollary 5.7.1. For any other k we have that |f(km + a)| ≠ 0
or m, in which case |f(km + a)| is divisible by m and |f(km + a)| > m, so that
|f(km + a)| is composite.
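
The proof of Proposition 5.7.1 is constructive, and the steps above can be sketched as follows. This is an illustration only: the function names are hypothetical, and a polynomial is represented by its list of integer coefficients, lowest degree first. The code finds a with |f(a)| > 1 in the range 0 ≤ a ≤ 3d, sets m = |f(a)|, and then walks through n = km + a, keeping those n for which |f(n)| is a multiple of m other than 0 and m.

```python
def evaluate(coeffs, x):
    """Evaluate the polynomial with coefficients coeffs[i] of x^i, by Horner's rule."""
    value = 0
    for c in reversed(coeffs):
        value = value * x + c
    return value

def composite_values(coeffs, how_many):
    """Return (a, m, list of (n, |f(n)|)) with each |f(n)| a multiple of m exceeding m."""
    d = len(coeffs) - 1
    # Corollary 5.7.1 guarantees such an a exists among 0, 1, ..., 3d.
    a = next(a for a in range(3 * d + 1) if abs(evaluate(coeffs, a)) > 1)
    m = abs(evaluate(coeffs, a))
    found, k = [], 1
    while len(found) < how_many:
        n = k * m + a
        value = abs(evaluate(coeffs, n))
        if value not in (0, m):          # then m divides value and value > m
            found.append((n, value))
        k += 1
    return a, m, found

a, m, found = composite_values([41, 1, 1], 5)    # f(x) = x^2 + x + 41
print(a, m, found)
```

For x² + x + 41 this recovers a = 0 and m = 41, so the composite values found are exactly the values f(41k) discussed earlier.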


Exercise 5.7.1. Show that if f(x,y) ∈ Z[x,y] has degree d ≥ 1, then there are infinitely many
pairs of integers m, n for which |f(m,n)| is composite.


Nine of the first ten values of the polynomial 6n + 5 are primes. The polynomial
n² + n + 41, discovered by Euler in 1772, is prime for n = 0, 1, 2, ..., 39 and the
square of a prime for n = 40. However, in the proof of Proposition 5.7.1 we saw
that n² + n + 41 is composite whenever n is a positive multiple of 41. See section
12.5 for more on such prime rich polynomials.


We discuss other places to look for primes in section 5.21 of appendix 5G.


It is not difficult to show that if a polynomial f takes on infinitely many prime 
values, then f must be irreducible. The next result indicates how many prime 
values f needs to take before we know that f is irreducible. 


Theorem 5.4. If f(x) ∈ Z[x] has degree d ≥ 1 and |f(n)| is prime for ≥ 2d + 1
integers n, then f(x) is irreducible.


Proof. Suppose that f is reducible; that is, f = gh for polynomials g(x), h(x)
∈ Z[x]. If |f(n)| = p, a prime, then g(n)h(n) = p or −p. Therefore one of g(n)
and h(n) equals p or −p, the other 1 or −1. In particular n is a root of
(g(x) − 1)(h(x) − 1)(g(x) + 1)(h(x) + 1), a polynomial of degree 2d. This has
no more than 2d roots by Lemma 5.7.1 and so |f(n)| can be prime for no more
than 2d integers n.


This is often more than we need, as we see in the following beautiful result: 


Theorem 5.5. Write a given prime p in base 10 as p = a_0 + a_1·10 + ⋯ + a_d·10^d
(with each a_i ∈ {0, 1, 2, ..., 9} and a_d ≠ 0). Then a_0 + a_1 x + ⋯ + a_d x^d is an
irreducible polynomial.


Proof. Let f(x) = a_0 + a_1 x + ⋯ + a_d x^d and suppose that f = gh. As g(10)h(10)
is prime, one of g(10) and h(10) equals 1 or −1. We will suppose that it is g
(swapping g and h if necessary). As g(x) ∈ Z[x] it can be written in the form
$g(x) = c \prod_j (x - \alpha_j)$ with c ∈ Z, and so $\prod_j |10 - \alpha_j| \le |g(10)| = 1$. Therefore
there is a root α of g(x) for which |α − 10| ≤ 1. This implies that Re(α) ∈ [9, 11]
and so Re(1/α) > 0 and |α| ≥ 9.

As f(α) = 0 we deduce that


$0 = \mathrm{Re}\left( \frac{f(\alpha)}{\alpha^d} \right) = a_d + a_{d-1} \mathrm{Re}\left( \frac{1}{\alpha} \right) + \sum_{i=2}^{d} a_{d-i} \mathrm{Re}\left( \frac{1}{\alpha^i} \right)$.


As discussed above Re(1/α) > 0 and so $a_{d-1}$ Re(1/α) ≥ 0. On the other hand,
Re(1/α^i) might be negative and so $a_{d-i}$ Re(1/α^i) ≥ −9/|α|^i. Therefore


$0 \ge 1 + 0 - 9 \sum_{i=2}^{d} \frac{1}{|\alpha|^i}$,


which implies that


$1 \le 9 \sum_{i \ge 2} \frac{1}{|\alpha|^i} = \frac{9}{|\alpha| (|\alpha| - 1)} \le \frac{1}{8}$


as |α| ≥ 9, which yields a contradiction.


Exercise 5.7.2. Prove an analogous result for primes written in an arbitrary base b ≥ 3.
Exercise 5.7.3.† Suppose that f(x) = a_0 + a_1 x + ⋯ + a_d x^d ∈ Z[x] with each |a_i| ≤ A and a_d ≠ 0.
Prove that if f(n) is prime for some integer n ≥ A + 2, then f(x) is irreducible.


There are many books on the distribution of primes. My favorites for beginners
are [TMF00] which explains the key ideas behind the prime number theorem and
other important results in an accessible way, and which is more recreational
but full of good stuff. The introductory book [HW08] proves quite a few of the
easier theorems in the subject.


Additional exercises 


Exercise 5.8.1. Let m be the product of the primes ≤ 1000. Prove that if n is an integer between
10³ and 10⁶, then n is prime if and only if (n,m) = 1.


Exercise 5.8.2. Show that if p > 3 and q = p + 2 are twin primes, then p + q is divisible by 12.


Exercise 5.8.3. Show that there are infinitely many integers n for which each of n, n + 1, ...,
n + 1000 is composite.


Exercise 5.8.4. Fix integer m > 1. Show that there are infinitely many integers n for which
τ(n) = m.


Exercise 5.8.5.† Fix integer k > 1. Prove that there are infinitely many integers n for which
μ(n) = μ(n + 1) = ⋯ = μ(n + k).


Exercise 5.8.6. Let H be a proper subgroup¹⁷ of (Z/mZ)*.
(a) Show that if a is coprime to m and q is a given non-zero integer, then there are infinitely
many integers n ≡ a (mod m) such that (n,q) = 1.
(b) Prove that if n is an integer coprime to m but which is not in a residue class of H, then n
has a prime factor which is not in a residue class of H.
(c) Deduce there are infinitely many primes which do not belong to any residue class of H.


Exercise 5.8.7.† Suppose that for any coprime integers a and q there exists at least one prime
≡ a (mod q). Deduce that for any coprime integers A and Q, there are infinitely many primes
≡ A (mod Q).


Exercise 5.8.8. Prove that there are infinitely many primes p for which there exists an integer
a such that a² − a + 1 ≡ 0 (mod p).


Exercise 5.8.9. Prove that for any f(x) ∈ Z[x] of degree ≥ 1, there are infinitely many primes
p for which there exists an integer a such that p divides f(a).


Exercise 5.8.10. Let L(n) = lcm[1, 2, ..., n].
(a) Show that L(n) divides L(n + 1) for all n ≥ 1.
(b) Express L(n) as a function of the prime powers ≤ n.
(c) Prove that for any integer k there exist integers n for which L(n) = L(n + 1) = ⋯ = L(n + k).
(d)† Prove that if k is sufficiently large, then there is such an integer n which is < 3^k.


Exercise 5.8.11.† Prove that


$\mathrm{Li}(x) \big/ \frac{x}{\log x} \to 1$ as x → ∞.


Exercise 5.8.12. Prove that 1 is the best choice for B when approximating Li(x) by x/(log x − B).


Exercise 5.8.13.† Using the Maynard-Tao result, prove that there exists a positive integer k ≤
246 for which there are infinitely many prime pairs p, p + k.


17 H is a proper subgroup of G if it is a subgroup of G but not the whole of G.


Exercise 5.8.14. Suppose that a and b are integers for which g(a) = 1 and g(b) = −1, where
g(x) ∈ Z[x].
(a) Prove that b = a − 2, a − 1, a + 1, or a + 2.
(b)† Deduce that there are no more than four integer roots of (g(x) − 1)(g(x) + 1) = 0.
(c)† Show that if g(x) has degree 2 and there are four integer roots of (g(x) − 1)(g(x) + 1) = 0,
then g(x) = h(x − A) where h(t) = t² − 3t + 1, with roots A, A + 1, A + 2, and A + 3.
(d)† Modify the proof of Theorem 5.4 to establish that if f(x) ∈ Z[x] has degree d ≥ 6 and
|f(n)| is prime for ≥ d + 3 integers n, then f(x) is irreducible.


Let f(x) = h(x)h(x − 4), which has degree 4. Note that |f(n)| is prime for the eight values
n = 0, 1, ..., 7, and so there is little room in which to improve (d).
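
The closing remark about f(x) = h(x)h(x − 4), with h(t) = t² − 3t + 1, can be checked directly. The sketch below is a verification aid with hypothetical helper names; it lists |f(n)| for n = 0, ..., 7 and tests each value for primality by trial division.

```python
def is_prime(n):
    """Trial-division primality test."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

def h(t):
    """The quadratic t^2 - 3t + 1, which equals 1, -1, -1, 1 at t = 0, 1, 2, 3."""
    return t * t - 3 * t + 1

def f(x):
    """The reducible quartic h(x) * h(x - 4)."""
    return h(x) * h(x - 4)

values = [abs(f(n)) for n in range(8)]
print(values, all(is_prime(v) for v in values))
```

In each case one factor is ±1 and the other is ±p for a prime p, which is exactly the situation analyzed in the proof of Theorem 5.4.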


One can show that there are reducible polynomials f(x) ∈ Z[x] of arbitrarily
large degree d for which |f(n)| takes on at least d + 1 prime values: Let p_1 < ⋯ < p_m
be distinct primes. Let $g(x) = \prod_{j=1}^{m} (p_j^2 - x^2)$ and q = g(1). By Dirichlet’s Theorem
(section 5.3) we know that there are infinitely many primes p_0 ≡ 1 (mod q).¹⁸ We
select one such prime and write p_0 = 1 + ℓq for some positive integer ℓ. Now let
f(x) = x(1 + ℓg(x)) which has degree d := 2m + 1. We have that |f(±1)| = 1 + ℓq =
p_0 and |f(±p_j)| = p_j for j = 1, ..., m, so there are ≥ 2m + 2 = d + 1 integers n for
which |f(n)| is prime.


In the next exercise, assuming certain conjectures,¹⁹ we construct reducible
polynomials f(x) ∈ Z[x] of arbitrarily large degree d for which |f(n)| takes on d + 2
prime values. This implies that the result in exercise 5.8.14(d) is “best possible”.


Exercise 5.8.15.† Assume that there are infinitely many positive integers n for which n² − 3n + 1
is prime, and denote these integers by n_1 < n_2 < ⋯. Let g_m(x) := (n_1 − x)⋯(n_m − x). If ℓ
is a positive integer for which 1 + ℓg_m(0), 1 + ℓg_m(1), 1 + ℓg_m(2), 1 + ℓg_m(3) are simultaneously
prime, then prove that the polynomial f(x) := (x² − 3x + 1)(1 + ℓg_m(x)) has degree d := m + 2
and that there are exactly d + 2 integers n for which |f(n)| is prime.


18 We will prove this later, in Theorem 7.3.
19 These conjectures follow from the Polynomial prime values conjecture stated in the bonus
section of this chapter.


5.9. Bertrand’s postulate
In 1845 Bertrand conjectured, on the basis of calculations up to a million:


Theorem 5.6 (Bertrand’s postulate). For every integer n ≥ 1, there is a prime
number between n and 2n.


Bertrand’s postulate was proved in 1850 by Chebyshev. We will follow the
19-year-old Erdős’s proof, or, as N. J. Fine put it (in the voice of Erdős):


Chebyshev said it, but I'll say it again: 
There’s always a prime between n and 2n. 


Exercise 5.9.1. Show that prime p does not divide $\binom{2n}{n}$ when 2n/3 < p ≤ n.


Proof of Bertrand’s postulate. Let $p^{e_p}$ be the exact power of prime p dividing
$\binom{2n}{n}$. We know that


• e_p = 1 if n < p ≤ 2n by Kummer’s Theorem (Theorem 3.7),
• e_p = 0 if 2n/3 < p ≤ n by exercise 5.9.1,
• e_p ≤ 1 if √(2n) < p ≤ 2n by Corollary 3.10.1,
• $p^{e_p}$ ≤ 2n if p ≤ 2n by Corollary 3.10.1.

Combining these gives


$\frac{2^{2n}}{2n} \le \binom{2n}{n} = \prod_{p \le 2n} p^{e_p} \le \prod_{n < p \le 2n} p \cdot \prod_{p \le 2n/3} p \cdot \prod_{p \le \sqrt{2n}} 2n$

$\le \left( \prod_{n < p \le 2n} p \right) \times 4^{2n/3 - 1} \times (2n)^{(\sqrt{2n} + 1)/2}$,


using Lemma 5.5.1 to bound $\prod_{p \le 2n/3} p$ and the bound $\pi(\sqrt{2n}) \le \frac{1}{2}(\sqrt{2n} + 1)$ (as
neither 1 nor any even integer > 2 is prime). Taking logarithms we deduce that


$\sum_{\substack{p \text{ prime} \\ n < p \le 2n}} \log p \ge \frac{\log 4}{3} n - \frac{\sqrt{2n} + 3}{2} \log(2n)$.


This implies that


(5.9.1)    $\sum_{\substack{p \text{ prime} \\ n < p \le 2n}} \log p \ge \frac{1}{3} n$


for all n ≥ 2349, which implies Bertrand’s postulate in this range. (This lower
bound should be compared to the upper bound (5.5.4).)


If 1 ≤ n ≤ 5000, then the interval (n, 2n] contains at least one of the primes 2,
3, 5, 7, 13, 23, 43, 83, 163, 317, 631, 1259, 2503, and 5003.
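
The final step of the proof relies on a short chain of primes, each less than twice its predecessor. The following sketch (hypothetical names, trial-division primality) confirms that every number in the chain is prime and that every interval (n, 2n] with 1 ≤ n ≤ 5000 contains one of them.

```python
CHAIN = [2, 3, 5, 7, 13, 23, 43, 83, 163, 317, 631, 1259, 2503, 5003]

def is_prime(n):
    """Trial-division primality test."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

def chain_covers(limit):
    """Check that (n, 2n] contains a member of CHAIN for every 1 <= n <= limit."""
    return all(any(n < p <= 2 * n for p in CHAIN) for n in range(1, limit + 1))

print(all(is_prime(p) for p in CHAIN), chain_covers(5000))
```

The check works because each prime in the chain is smaller than twice the previous one, so consecutive intervals (p, 2p] overlap.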


Exercise 5.9.2. Use Bertrand’s postulate to prove that there are infinitely many primes with 
first digit “1”. 


Exercise 5.9.3. Use Bertrand’s postulate to show, by induction, that every integer n > 6 can be 
written as the sum of distinct primes. 


Exercise 5.9.4. Goldbach conjectured that every even integer > 6 can be written as the sum of 
two primes. Deduce Bertrand’s postulate from Goldbach’s conjecture. 


Exercise 5.9.5. Use Bertrand’s postulate to prove that hy t why Pec x is never an integer. 
Exercise 5.9.6. Prove that for every n ≥ 1 one can partition the set of integers {1, 2, ..., 2n}
into pairs {a_1, b_1}, ..., {a_n, b_n} such that each sum a_j + b_j is a prime.


Exercise 5.9.7.† (a) Prove that prime p divides $\binom{2n}{n}$ when n/2 < p ≤ 2n/3.
(b) Prove that the product of the primes in (3m, 12m] divides $\binom{12m}{6m} \binom{6m}{4m}$.
(c)† Deduce that we can take any constant c_2 > (2/9) log(432) in (5.5.1).

(Note that (2/9) log(432) = 1.3485... < log 4 = 1.3862...)


(d) Now deduce Bertrand’s postulate for all sufficiently large x from (5.5.1).


5.10. The theorem of Sylvester and Schur 


Bertrand’s postulate can be rephrased to state that at least one of the integers 
k+1,k+2,...,2k has a prime factor > k. This can be generalized as follows: 


Theorem 5.7 (Sylvester-Schur Theorem). For any integers n ≥ k ≥ 1, at least
one of the integers n + 1, n + 2, ..., n + k is divisible by a prime p > k.


Proposition 5.10.1. If, for given integers n ≥ k ≥ 1, we have


(5.10.1)    $\binom{n+k}{k} > (n + k)^{\pi(k)}$,


then at least one of the integers n + 1, n + 2, ..., n + k is divisible by a prime p > k.
If (5.10.1) holds for n_1(k), then it holds for all n ≥ n_1(k).




Proof. If the prime factors of n + 1, n + 2, ..., n + k are all ≤ k, then all of the
prime factors p of $\binom{n+k}{k}$ are ≤ k. If $p^e \| \binom{n+k}{k}$ then $p^e$ ≤ n + k by Corollary 3.10.1.
Therefore


(5.10.2)    $\binom{n+k}{k} \le \prod_{p \le k} (n + k) = (n + k)^{\pi(k)}$,

contradicting (5.10.1). This proves the first part of the result.


We prove the second part by induction on n ≥ n_1(k) using the following result.


k 
Exercise 5.10.1. Prove that ( + a) < (1 + sit) for alla >k> 1. 


The result holds for n = n_1(k), so now suppose that (5.10.1) holds for some
given n. Then


$\binom{n+1+k}{k} = \left( \frac{n+1+k}{n+1} \right) \binom{n+k}{k} > \left( \frac{n+1+k}{n+1} \right) (n + k)^{\pi(k)} \ge (n + 1 + k)^{\pi(k)}$


by exercise 5.10.1 and the induction hypothesis, and so (5.10.1) holds for n + 1.
The result follows.


Proof of the Sylvester-Schur Theorem for all k ≤ 1500. Calculations give
some value for n_1(k) in Proposition 5.10.1 for all k ≤ 1500, and so the Sylvester-
Schur Theorem follows for these k and all n ≥ n_1(k) by Proposition 5.10.1. Now
n_1(k) = k for 202 ≤ k ≤ 1500, and k ≤ n_1(k) ≤ k + 17 for all k ≤ 201. We verify
the theorem for k ≤ n ≤ k + 16 with k ≤ 201, case by case.
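
The case-by-case verification is the kind of computation that is easily delegated to a machine. The sketch below is one possible way to organize it, not the calculation used for the book; all function names are hypothetical. It checks the conclusion of the Sylvester-Schur Theorem directly for k ≤ n ≤ k + 16 with k ≤ 201, and also provides an exact test of inequality (5.10.1) in the form $\binom{n+k}{k} > (n+k)^{\pi(k)}$.

```python
from math import comb

def largest_prime_factor(n):
    """Largest prime factor of n >= 2, by trial division."""
    largest, d = 1, 2
    while d * d <= n:
        while n % d == 0:
            largest, n = d, n // d
        d += 1
    return n if n > 1 else largest

def sylvester_schur_holds(n, k):
    """True if one of n+1, ..., n+k is divisible by a prime exceeding k."""
    return any(largest_prime_factor(n + i) > k for i in range(1, k + 1))

def prime_count(k):
    """pi(k), the number of primes <= k."""
    return sum(1 for m in range(2, k + 1) if largest_prime_factor(m) == m)

def inequality_5_10_1(n, k):
    """Exact integer test of binom(n+k, k) > (n+k)^pi(k)."""
    return comb(n + k, k) > (n + k) ** prime_count(k)

small_cases_ok = all(
    sylvester_schur_holds(n, k)
    for k in range(1, 202)
    for n in range(k, k + 17)
)
print(small_cases_ok, inequality_5_10_1(202, 202), inequality_5_10_1(201, 201))
```

The last two printed values illustrate why k = 202 is the threshold in the text: with n = k the inequality holds at k = 202 but fails at k = 201.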


A just failed proof of the Sylvester-Schur Theorem. Calculations suggest
that $\binom{2k}{k} > (2k)^{\pi(k)}$ for all k ≥ 202. If so, the Sylvester-Schur Theorem follows for
all k ≥ 202 by Proposition 5.10.1. However we just failed to prove this inequality
as a consequence of the upper bound in Theorem 5.3. If one combines the upper
bound on π(k/4) from Theorem 5.3 together with exercise 5.9.7(b), then we can
prove that $\binom{2k}{k} > (2k)^{\pi(k)}$ for all sufficiently large k. However “sufficiently large”
here is likely to be extremely large.


Exercise 5.10.2. Prove that if m(k) < en —1 for all integers k > 1, then Theorem [5.7Jholds 
foraln>k>1. 


Proof of the Sylvester-Schur Theorem for all k > 1500. If (5.10.1) holds,
then the result follows from Proposition 5.10.1. Hence we may assume that (5.10.2)
holds. Now, π(k) ≤ k/3 (which can be proved by accounting for divisibility by 2
and 3), and $\frac{n+k-j}{k-j} \ge \frac{n+k}{k}$ for j = 0, ..., k − 1 so that $\binom{n+k}{k} \ge \left( \frac{n+k}{k} \right)^k$. Therefore
(5.10.2) implies that


$\left( \frac{n+k}{k} \right)^k \le \binom{n+k}{k} \le (n + k)^{\pi(k)} \le (n + k)^{k/3}$,


which in turn implies that


$n + k \le k^{3/2}$, that is, $n \le k^{3/2} - k$.




Next we note that if $p > (n + k)^{1/2}$ and $p^e \| \binom{n+k}{k}$, so that $p^e$ ≤ n + k, then e = 0
or 1. Therefore we can refine (5.10.2) to


k : 
(5.10.3) eS ) < [J (™+h& [[ =e ae, 
p<(n+k)}/? PSk 
by G54), as m((n + k)/?) < (n+ kh)? < £K3/4. 
Now if n > 3k, then, by exercise [4.14.2] of appendix 4D, 
(4*/3%)* Pe ee ee Ces ee paket gk-1 
ek TAK) k - 


which is false for all k > 1. Thereforen+k < 4k, and so ifn+k> 3k, then our 
inequality becomes 

Geyaeo i. (572 n+k 1/2 py 

SPST (BRD) (MEP) celta 
This is false for all k > 780. 

Finally for the range k ≤ n < 3k/2 if prime p is in the range (n + k)/3 < p ≤ k,
then 2p is the only multiple of p that appears in (n + 1)⋯(n + k) and so p does
not divide $\binom{n+k}{k}$. Therefore


G<Ce)s Tl @+) TT vs I] core’ TT» 


p<(n-+k)3/2 PS(m+k)/3 p< (ntk)}/2 pSdk/6 


which implies that 
4k 1/2 

caer Ak k 45k/6-1 
ek — ah) 

which is false for all k > 1471. 


Exercise 5.10.3. (a) Use Bertrand’s postulate and the Sylvester-Schur Theorem to show that
if $1 \le r < s$, then there is a prime p that divides exactly one of the integers $r+1, \ldots, s$.
(b) Deduce that if $1 \le r < s$, then $\frac{1}{r+1} + \frac{1}{r+2} + \cdots + \frac{1}{s}$ is never an integer.


Bonus read: A review 
of prime problems 


5.11. Prime problems 


In this bonus section we will discuss various natural sequences that are expected to 
contain infinitely many primes, highlighting recent progress. 


Mathematicians have tried in vain to discover some order in the sequence 
of the prime numbers and we have every reason to believe that there are 
some mysteries that the human mind shall never penetrate. 

— LEONHARD EULER (1740) 


Prime values of polynomials in one variable 


In section 5.6 we mentioned the twin prime conjecture, that there are infinitely
many pairs of primes that differ by 2. What about other pairs? Obviously there
can be no more than one pair of primes that differ by an odd integer k (as one of
the two integers must be divisible by 2), but when the difference is an even integer
k, there is no such obstruction. Calculations then suggest that:


For all even integers 2m > 0 there are infinitely many pairs of primes that differ 
by 2m. That is, there are infinitely many prime pairs p,p + 2m. 


Here we asked for simultaneous prime values of two monic linear polynomials $x$
and $x + 2m$. What if we select polynomials with different leading coefficients, like $x$
and $2x + 1$? Such prime pairs come up naturally in Sophie Germain’s Theorem
(of section 7.27 in appendix 7F) and calculations support the guess that there are
many (like 3 and 7; 5 and 11; 11 and 23; 23 and 47; ...). We therefore conjecture:


There are infinitely many pairs of primes p,2p + 1. 
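The calculations behind this guess are easy to repeat. The following is a minimal sketch, not taken from the text: the helper names `is_prime` and `germain_pairs` are hypothetical, and trial division is assumed to be adequate because only small values are tested.

```python
def is_prime(n):
    """Primality by trial division; fine for the small values used here."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def germain_pairs(limit):
    """List the pairs (p, 2p + 1) with p < limit in which both entries are prime."""
    return [(p, 2 * p + 1) for p in range(2, limit)
            if is_prime(p) and is_prime(2 * p + 1)]


print(germain_pairs(50))
```

The pairs quoted in the text (3 and 7; 5 and 11; 11 and 23; 23 and 47) all appear in the output, and raising `limit` gives a feel for how often such pairs occur. This is evidence only, of course, and proves nothing about infinitude.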


One can generalize this to other pairs of linear polynomials but we might again
have the problem that at least one is even, as with $p, 3p + 1$.




Exercise 5.11.1. Give conditions on integers a, b, c, d with a, c > 0, assuming that (a,b) =
(c,d) = 1, which guarantee that there are infinitely many integers n for which an + b and cn + d
are different and both positive and odd. We conjecture, under these conditions, that:


There are infinitely many pairs of primes am + b, cm + d.


For triples of linear forms and even k-tuplets of linear forms, there are more
exceptional cases. For example, the three polynomials $n, n+2, n+4$ can all
simultaneously take odd values but, for each integer n, one of them is divisible by
3. We call 3 a fixed prime divisor, which plays the same role as 2 in the
example $n, n+k$ with k odd. In general we need that a given set of linear forms
$a_1x + b_1, a_2x + b_2, \ldots, a_kx + b_k$ with integer coefficients is admissible; that is, there
is no fixed prime divisor p. Specifically, for each prime p, there exists an integer
$n_p$ for which none of the $a_jn_p + b_j$ is divisible by p, which implies that p does not
divide $a_jn + b_j$ for $1 \le j \le k$ for every integer $n \equiv n_p \pmod p$. This leads us to


The prime k-tuplets conjecture. Let $a_1x + b_1, \ldots, a_kx + b_k$ be an admissible
set of k linear polynomials with integer coefficients, such that each $a_j$ is positive.
Then there are infinitely many positive integers m for which

$a_1m + b_1, \ldots, a_km + b_k$ are all prime.
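Admissibility can be tested mechanically. The sketch below is an illustration under stated assumptions: the function names are hypothetical, each form $a_jx + b_j$ is given as a pair `(a, b)` with `a > 0`, and only primes up to $\max(k, \gcd(a_j, b_j))$ are tried. That bound is an assumption of this sketch, justified as follows: if p divides no $\gcd(a_j, b_j)$, then each form vanishes at no more than one residue class mod p, so k forms can cover all p classes only when $p \le k$.

```python
from math import gcd


def primes_up_to(n):
    """Return the primes <= n by trial division."""
    return [p for p in range(2, n + 1)
            if all(p % q for q in range(2, int(p ** 0.5) + 1))]


def fixed_prime_divisors(forms):
    """forms is a list of pairs (a, b), each standing for the linear form a*x + b.

    Return the primes p such that, for every integer n, at least one of the
    forms takes a value divisible by p. The set is admissible exactly when
    the returned list is empty.
    """
    bound = max([len(forms)] + [gcd(a, b) for a, b in forms])
    fixed = []
    for p in primes_up_to(bound):
        # Only the residue class of n mod p matters, so test n = 0, ..., p - 1.
        if all(any((a * n + b) % p == 0 for a, b in forms) for n in range(p)):
            fixed.append(p)
    return fixed


print(fixed_prime_divisors([(1, 0), (1, 2), (1, 4)]))  # n, n+2, n+4
print(fixed_prime_divisors([(1, 0), (1, 2)]))          # n, n+2
```

For the triple $n, n+2, n+4$ the function reports the fixed prime divisor 3 discussed above, while the twin prime pair $n, n+2$ has none.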


Exercise 5.11.2.† Assuming the prime k-tuplets conjecture deduce that there are infinitely many
pairs of consecutive primes p, p + 100.


Exercise 5.11.3.† Assuming the prime k-tuplets conjecture deduce that there are infinitely many
triples of consecutive primes in an arithmetic progression.


Exercise 5.11.4.† Assuming the prime k-tuplets conjecture deduce that there are infinitely many
quadruples of consecutive primes formed of two pairs of prime twins.


Exercise 5.11.5.† Let $a_{n+1} = 2a_n + 1$ for all $n \ge 0$. Fix an arbitrarily large integer N. Use the
prime k-tuplets conjecture to show that we can choose $a_0$ so that $a_0, a_1, \ldots, a_N$ are all primes.


Exercise 5.11.6. Show that the set of linear polynomials $a_1m + 1, a_2m + 1, \ldots, a_km + 1$, with
each $a_j$ positive, is admissible.


There is more on prime k-tuplets of linear polynomials in appendix 5E. 


What about other polynomials? For example, the polynomial $n^2 + 1$ takes
prime values 2, 5, 17, 37, 101, ... seemingly on forever, so we conjecture that:


There are infinitely many primes of the form $n^2 + 1$.
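The list of prime values is easy to extend by machine. This is a small sketch with hypothetical helper names, using trial division since the values involved are tiny.

```python
def is_prime(n):
    """Primality by trial division; fine for the small values used here."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def prime_values_of_n2_plus_1(max_n):
    """Return the primes of the form n^2 + 1 with 1 <= n <= max_n, in order of n."""
    return [n * n + 1 for n in range(1, max_n + 1) if is_prime(n * n + 1)]


print(prime_values_of_n2_plus_1(20))
```

Apart from n = 1, only even n can contribute, since $n^2 + 1$ is even whenever n is odd.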


The polynomial x? + 2x2 cannot be prime for many integer values since it is 
reducible (recall Theorem 5.4 and exercise 5.8.14(c)). This is a different reason
(from the fixed prime factors above) for a polynomial not to take more than finitely 
many prime values. These are the only reasons known for a polynomial not to take 
infinitely many prime values and, if neither of them holds, then we believe that the 
polynomial does take on infinitely many prime values. More precisely: 


Polynomial prime values conjecture. Let $f_1(x), \ldots, f_k(x) \in \mathbb{Z}[x]$, each irreducible,
with positive leading coefficients. If $f_1 \cdots f_k$ has no fixed prime divisor,
then:
There are infinitely many integers m for which $f_1(m), \ldots, f_k(m)$ are all prime.
To be precise, if $f_1, \ldots, f_k$ have “no fixed prime divisor” then we mean that for
every prime p there exists an integer $n_p$ such that $f_1(n_p) \cdots f_k(n_p)$ is not divisible
by p. The polynomial prime values conjecture specialized to linear polynomials is
the prime k-tuplets conjecture.$^{20}$


Exercise 5.11.7. Prove that the only prime pair $p, p^2 + 2$ is 3, 11.


Exercise 5.11.8. (a) Prove that if $f_1 \cdots f_k$ has no fixed prime divisor, then, for each prime p,
there are infinitely many integers n such that $f_1(n) \cdots f_k(n)$ is not divisible by p.
(b)‡ Show that if $p > \deg(f_1(x) \cdots f_k(x))$ and p does not divide $f_1(x) \cdots f_k(x)$, then $n_p$ exists.
(c) Prove that if $f_j(x) = x + h_j$ for given integers $h_1, \ldots, h_k$, then $n_p$ exists for a given prime
p if and only if #{distinct $h_j \pmod p$} < p.


The only case of the polynomial prime values conjecture that has been proved
is when $k = 1$ and $f_1$ is linear. The hypothesis ensures that $f_1(x) = qx + a$
with $q \ge 1$ and $(a,q) = 1$. This is Dirichlet’s Theorem (that there are infinitely
many primes $\equiv a \pmod q$ whenever $(a,q) = 1$), which we discuss in section 8.17 of
appendix 8D and in section 13.7.


Distinguishing primes and $P_k$’s from other integers. The Möbius function
was introduced in section 4.5 and in Corollary 4.5.1 we saw that the sum

$$\sum_{d \mid n} \mu(d)$$

is non-zero only if n = 1 and so allows us to distinguish the integer 1 from all other
positive integers. In appendix 4B we saw that if the sum

$$\sum_{d \mid n} \mu(d) \log(n/d)$$

is non-zero, then n has exactly one prime factor and so allows us to distinguish
primes and prime powers from all other positive integers. A positive integer is
called a “$P_k$” if it has no more than k distinct prime factors. In the next exercise
we will see how an analogous sum allows us to distinguish $P_k$’s.
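Both sums can be checked numerically. The following is a minimal sketch with hypothetical function names; it evaluates the sums by brute force over all divisors and uses floating-point logarithms, so the second sum should be compared to zero only up to rounding error.

```python
from math import log


def mobius(n):
    """Moebius function: 0 if a square divides n, else (-1)^(number of prime factors)."""
    result = 1
    p = 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0      # p^2 divided the original n
            result = -result
        p += 1
    if n > 1:                 # one prime factor is left over
        result = -result
    return result


def divisors(n):
    """All positive divisors of n, by brute force."""
    return [d for d in range(1, n + 1) if n % d == 0]


def mobius_sum(n):
    """Sum of mu(d) over d | n; non-zero only when n = 1."""
    return sum(mobius(d) for d in divisors(n))


def mobius_log_sum(n):
    """Sum of mu(d) log(n/d) over d | n; non-zero only for prime powers."""
    return sum(mobius(d) * log(n / d) for d in divisors(n))


for n in (1, 7, 8, 12, 30):
    print(n, mobius_sum(n), round(mobius_log_sum(n), 9))
```

For n = 8 the second sum equals log 2 (the logarithm of the single prime dividing 8), while for n = 12 and n = 30, which have two and three distinct prime factors, it vanishes.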


Exercise 5.11.9.† (a)‡ Let $x_0, \ldots, x_m$ be variables. Prove that if $m > k \ge 0$, then

$$\sum_{S \subset \{1,2,\ldots,m\}} (-1)^{|S|} \Big( x_0 + \sum_{j \in S} x_j \Big)^k = 0.$$

(b) Deduce that if n has more than k different prime factors, then

$$\sum_{d \mid n} \mu(d) (\log(n/d))^k = 0.$$

(c)‡ What value does this take when n has exactly k different prime factors?


Exercise 5.11.10. Show that if each prime factor of n is $> n^{1/3}$, then n is either prime or the
product of two primes.


Prime values of polynomials in several variables 


One can ask for prime values of polynomials in two or more variables, for example,
primes of the form $m^2 + n^2$ or the form $a^2 + b^2 + 1$ or more complicated polynomials
of mixed degree like $4a^3 + 27b^2$. What is known?


$^{20}$ This conjecture was first formulated by Andrzej Schinzel in 1958. He called it “Hypothesis H”
in that paper, and the name has stuck.

The proof of the prime number theorem can be adapted to many situations,
for example to primes of the form $m^2 + n^2$ or the form $2u^2 + 2uv + 3v^2$ or indeed
the prime values of any irreducible binary quadratic form (which are discussed in
chapters 9 and 12) without a fixed prime divisor. The proof for $m^2 + n^2$ uses the
fact that $m^2 + n^2 = (m + in)(m - in)$, the norm of $m + in$. One can develop this
to prove that any such norm form (the appropriate generalization$^{21}$ of $m^2 + n^2$
to higher degree) takes on infinitely many prime values as long as it has no fixed
prime factor. A norm form is always a degree d polynomial in d variables.


One can then ask for prime values of norm forms in which we fix some of the
variables (perhaps to 0). For example, if m = 1 in $m^2 + n^2$, we are back to the open
question about prime values of $n^2 + 1$. However in 2002 Heath-Brown was able to
prove that $a^3 + 2b^3$ takes on infinitely many prime values and then extended this, with
Moroz, to any irreducible cubic form in two variables. In 2018, Maynard proved
such a result for a family of norm forms$^{22}$ in 3m variables of degree 4m (or less).


These results on norm forms were all inspired by Friedlander and Iwaniec’s 1998
breakthrough in which they took n to be a square in $m^2 + n^2$ (and therefore found
prime values of $u^2 + v^4$), following Fouvry and Iwaniec’s 1997 paper in which they
took n to be prime (and therefore obtained infinitely many prime pairs $p, m^2 + p^2$).
This was the first example in which the polynomial in question is sparse in that the
number of integer values it takes up to x is roughly $x^c$ for some c < 1. The current
record sparsity is $c = \frac{2}{3}$ from the work of Heath-Brown and Moroz. In 2017,
Heath-Brown and Xiannan Li went beyond the Fouvry-Iwaniec and Friedlander-Iwaniec
results by showing that there are infinitely many prime pairs $p, m^2 + p^4$.


In every case we expect that the proportion of values of the polynomial up to
x which are prime is about $c/\log x$, where c is a constant which depends on how
often each prime divides values of the polynomial.


Back in 1974, Iwaniec had shown how versatile sieve methods could be by 
showing that any quadratic polynomial in two variables (which is irreducible and 
has no fixed prime divisor) takes on infinitely many prime values, for example, 
$m^2 + n^2 + 1$. We will see this result put to good use in appendix 12G when tiling
a circle with smaller circles. 


What about the prime values of more than one polynomial in several variables? 
We can generalize our conjectures as follows: 


Multivariable polynomial prime values conjecture. Let $f_1(x_1, \ldots, x_n), \ldots,$
$f_k(x_1, \ldots, x_n) \in \mathbb{Z}[x_1, \ldots, x_n]$, each of which is irreducible. Suppose that there are
infinitely many n-tuplets of integers $m_1, \ldots, m_n$ for which each $f_j(m_1, \ldots, m_n)$ is
positive. If $f_1 \cdots f_k$ has no fixed prime divisor, then there are
infinitely many n-tuplets of integers $m_1, \ldots, m_n$ for which
$f_1(m_1, \ldots, m_n), \ldots, f_k(m_1, \ldots, m_n)$ are all prime.


$^{21}$ More precisely the norm of $\sum_{i=1}^{d} x_i \omega_i$ where the $\omega_i$ are a basis for the ring of integers of some
number field of degree d and the $x_i$ are the variables.
$^{22}$ The norm of $\sum_i x_i \omega^i$ where the field, of degree 4m, is generated by $\omega$ over $\mathbb{Q}$.


In 1939, van der Corput showed that there are infinitely many three-term arithmetic
progressions of primes, which can be written as

$$a,\ a + d,\ a + 2d,$$

three degree-one polynomials in two variables. For a long time, methods seemed
inadequate to extend this to length four arithmetic progressions, but this was
resolved in 2008 by Green and Tao, who proved that for any fixed integer $k \ge 3$ there
are infinitely many prime k-tuplets of the form

$$a,\ a + d,\ a + 2d,\ \ldots,\ a + (k-1)d.$$
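Small examples of such progressions can be found by direct search. The sketch below is an illustration only: the function name and the search order (increasing first term a, then increasing common difference d) are choices made here, not anything from the Green-Tao proof, which works by entirely different methods.

```python
def is_prime(n):
    """Primality by trial division; fine for the small values used here."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def first_prime_progression(k, limit):
    """Return the first k-term arithmetic progression of primes below `limit`.

    The search runs over increasing first term a and then increasing common
    difference d >= 1. Returns None if no progression fits. Assumes k >= 2.
    """
    for a in range(2, limit):
        if not is_prime(a):
            continue
        max_d = (limit - 1 - a) // (k - 1)   # keeps the last term below limit
        for d in range(1, max_d + 1):
            terms = [a + j * d for j in range(k)]
            if all(is_prime(t) for t in terms):
                return terms
    return None


print(first_prime_progression(3, 100))
print(first_prime_progression(5, 100))
print(first_prime_progression(6, 200))
```

The common differences found grow quickly with k: a fixed-prime-divisor argument like the one for $n, n+2, n+4$ forces d to be divisible by every prime smaller than k that does not occur as the first term.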


The methods used were quite new to the search for prime numbers and this has 
led to widespread interest. In 2012, along with Ziegler, they were able to prove a 
very general result for linear polynomials, which is as good as one can hope for, 
given that there has been no progress directly on the prime k-tuplets conjecture: 


Until we prove the twin prime conjecture we will be unable to prove the multivariable
polynomial prime values conjecture, in full generality, even for linear
polynomials, since two of the polynomials might differ by two, for example if $x + 3y$
and $x + 3y + 2$ are in our set. More generally, without progress on the prime k-tuplets
conjecture, we must avoid any linear relation between two of our polynomials.


Theorem 5.8 (The Green-Tao-Ziegler Theorem). Suppose that $f_1(x), \ldots, f_k(x)$
are linear polynomials which satisfy the hypothesis of the multivariable polynomial
prime values conjecture. Moreover assume that if $1 \le i < j \le k$, there do not exist
integers a, b, c, not all zero, for which $af_i + bf_j = c$. Then there are infinitely many
$m \in \mathbb{Z}^n$ for which $f_1(m), \ldots, f_k(m)$ are all prime.


We will discuss applications of the Green-Tao-Ziegler Theorem in appendix 5E. 


It is not difficult to show that there are infinitely many primes of the form
$b^2 - 4ac$, the discriminant of an arbitrary quadratic polynomial. However we do
not know how to prove that there are infinitely many primes of the form $4a^3 + 27b^2$,
the discriminant of the cubic polynomial $x^3 + ax + b$. Proving this would
have a significant impact on our understanding of various questions about degree 3
Diophantine equations.


Exercise 5.11.11. Let $g(x) = 1 + \prod_{j=1}^{k} (x - j)$. Prove that there exist integers a and b such that
the reducible polynomial $f(x) = (ax + b)g(x)$ is prime when $x = n$ for $1 \le n \le k$. Compare this
to the result in exercise 5.8.14(c) (with d = k + 1).


Goldbach’s conjecture and variants 


Goldbach’s 1742 conjecture is the statement that every even integer $\ge 4$ can be
written as the sum of two primes. It is still an open question though it has now
been verified for all even numbers $\le 4 \times 10^{18}$.
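On a very small scale the verification can be repeated in a few lines. This sketch is illustrative only; the function names are hypothetical, and the published verifications rely on far more efficient sieving than the trial division assumed here.

```python
def is_prime(n):
    """Primality by trial division; fine for the small values used here."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def goldbach_pair(n):
    """For an even n >= 4, return (p, q) with p <= q, p + q = n, both prime,
    choosing the smallest possible p. Return None if no such pair exists."""
    for p in range(2, n // 2 + 1):
        if is_prime(p) and is_prime(n - p):
            return (p, n - p)
    return None


print(goldbach_pair(100))
print(all(goldbach_pair(n) is not None for n in range(4, 2001, 2)))
```

In practice the smallest prime p that works is usually tiny compared with n, which is part of why the numerical evidence is so persuasive.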


Great problems motivate mathematicians to think of new techniques, which 
can have great influence on the subject, even if they fail to resolve the original 
question. For example, although there have been few plausible ideas for proving 
Goldbach’s conjecture, it has motivated some of the development of sieve theory, 
and there are some beautiful results on modifications of the original problem. The 
most famous are: 

In 1975 Montgomery and Vaughan showed that if there are any exceptions to
Goldbach’s conjecture (that is, even integers n that are not the sum of two primes),
then there are very few of them.


In 1973 Jingrun Chen showed that every sufficiently large even integer is the
sum of a prime and an integer that is the product of at most two primes. Here
“sufficiently large” means enormous.


In 1937 I. M. Vinogradov proved that every sufficiently large odd integer is the
sum of three primes. The “sufficiently large” has recently been removed: Harald
Helfgott, with computational assistance from David Platt, proved that every odd
integer > 1 is the sum of at most three primes.


Exercise 5.11.12. Show that the Goldbach conjecture is equivalent to the statement that every
integer > 1 is the sum of at most three primes.


Other questions 


Before this chapter we asked if there are infinitely many primes of the form $2^p - 1$
(Mersenne primes) or of the form $2^{2^n} + 1$ (Fermat primes). We can ask other
questions in this vein, for example prime values of second-order linear recurrences
which start 0, 1 (like the Fibonacci numbers) or their companion sequences (see
exercise 3.9.3) or prime values of high-order linear recurrence sequences.


Mersenne primes written in binary look like 111...111, and so are palindromic.
Some people have been interested in primes of the form $\frac{1}{9}(10^n - 1)$ which equal
111...111 in base 10 and so are palindromic. We are unable to prove there are
infinitely many Mersenne primes, so how about the easier question, are there infinitely
many palindromic primes when written in binary or in decimal or indeed in
any other base? Also open.


We saw earlier that it is not difficult to show that there are infinitely many 
primes with the first few digits given. But how about missing digits? Can one find 
infinitely many primes which have no 7 in their decimal expansion or no 9 or no 
consecutive digits 123? These questions are all answered in a remarkable recent 
paper of Maynard [4]. 


Let M be a given n-by-n matrix. The (i,j)th entry of $M, M^2, \ldots$ can all be
described by an nth-order linear recurrence sequence. To see this think of the
powers of $\begin{pmatrix} 2 & 0 \\ 0 & 1 \end{pmatrix}$. We have already asked whether the trace can take infinitely
many prime values. A recent question of interest is to take two (or more) such
matrices M and N say, and then look at the entries of all “words” created by M
and N, for example $M^a N^b M^c \cdots N^d$, and ask whether the entries are infinitely
often prime (see appendix 9D and appendix 12G for a beautiful
example).


Appendices. The extended version of chapter 5 has the following additional
appendices:

Appendix 5B. An important proof of infinitely many primes. We give Euler’s 
proof that there are infinitely many primes (which yields that the sum of the re- 
ciprocals of the primes diverges) and use this to show that the primes make up a 
vanishing proportion of the integers. We use this to introduce the Riemann zeta- 
function, as well as Riemann’s program for proving the prime number theorem. 


Appendix 5C. What should be true about primes? Here we explain Cramér’s
model for the distribution of primes based on Gauss’s thoughts and determine what 
it predicts about the expected longest gaps between primes. 


Appendix 5D. Working with Riemann’s zeta-function. We further develop Rie- 
mann’s program for proving the prime number theorem, detailing how the zeros of 
the Riemann zeta-function relate to the count of primes. We are therefore able to 
state the Riemann Hypothesis and discuss some attractive reformulations. 

Appendix 5E. Prime patterns: Consequences of the Green-Tao Theorem. We
look for all sorts of prime patterns and at fun questions about primes, for example
magic squares of primes like

 17 |  89 |  71
113 |  59 |   5
 47 |  29 | 101

 41 |  71 | 103 |  61
 97 |  79 |  47 |  53
 37 |  67 |  83 |  89
101 |  59 |  43 |  73

Examples of magic squares of primes.


Appendix 5F. A panoply of prime proofs presents several further proofs that
there are infinitely many primes, one by point-set topology, another using irrationality,
and yet another via a counting argument.


Appendix 5G. Searching for primes and prime formulas. We look for formulas 
for primes, including Matijasevic’s amazing polynomial in 26 variables, discuss their 
value, explore Conway’s prime-producing machine and patterns in Ulam’s spiral. 


Appendix 5H. Dynamical systems and infinitely many primes. Developing a 
perspective on Euclid’s original proof, we show that there are many different poly- 
nomials for which there are infinitely many prime divisors of the iterated values of 
the polynomial, starting from a non-periodic point. 


Chapter 6 


Diophantine problems 


Diophantine equations are polynomial equations in which we study the integer or 
rational solutions. They are named after Diophantus (who lived in Alexandria in 
the third century A.D.) who wrote up his understanding of such equations in his 
thirteen volume Arithmetica (though only six part-volumes survive today). This 
work was largely forgotten until interest was revived by Bachet’s 1621 translation 
of Arithmetica into Latin.$^{1}$


6.1. The Pythagorean equation 


Right-angled triangles with sides 3, 4, 5 and 5, 12, 13, etc., were known to the ancient
Babylonians. We wish to determine all right-angled triangles with integer sides,
which amounts to finding all solutions in positive integers x, y, z to the Pythagorean
equation

$$x^2 + y^2 = z^2.$$

Note that $z > x, y > 0$ as x, y, and z are all positive. We can reduce the problem,
without loss of generality, so as to work with some convenient assumptions:


• That x, y, and z are pairwise coprime, by dividing through by their gcd, as
in exercise 1.7.8.


• That x is even and y is odd, and therefore that z is odd: First note that x
and y cannot both be even, since x, y, and z are pairwise coprime; nor both
odd, by exercise 2.5.6(b). Hence one of x and y is even, the other odd, and
we interchange them, if necessary, to ensure that x is even and y is odd.


Under these assumptions we reorganize the equation and factor to get

$$(z - y)(z + y) = z^2 - y^2 = x^2.$$


$^{1}$ Translations of various ancient Greek texts into Latin helped inspire the Renaissance.

We now prove that $(z - y, z + y) = 2$: We observe that $(z - y, z + y)$ divides
$(z + y) - (z - y) = 2y$ and $(z + y) + (z - y) = 2z$, and that $(2y, 2z) = 2(y, z) = 2$.
Therefore $(z - y, z + y)$ divides 2, and so equals 2 as $z - y$ and $z + y$ are both even.

Therefore, since $(z - y)(z + y) = x^2$ and $(z - y, z + y) = 2$, there exist integers
r, s such that

$$z - y = 2s^2 \text{ and } z + y = 2r^2; \quad\text{or}\quad z - y = -2s^2 \text{ and } z + y = -2r^2,$$

by exercise 3.3.7(c). The second case is impossible since $r^2$, y, and z are all positive.
From the first case we deduce that

$$x = 2rs,\quad y = r^2 - s^2,\quad\text{and}\quad z = r^2 + s^2.$$


To ensure that x, y, and z are pairwise coprime we need (r,s) = 1 and r + s odd.
If we now multiply back in any common factors, we get the general solution

$$x = 2grs,\quad y = g(r^2 - s^2),\quad\text{and}\quad z = g(r^2 + s^2). \tag{6.1.1}$$

If we want an actual triangle, then the side lengths should all be positive so we
may assume that g > 0 and r > s > 0, as well as (r,s) = 1 and r and s having
different parities.$^{2}$ The reader should verify that the integers x, y, and z given by
this parametrization always satisfy the Pythagorean equation.
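The suggested verification can also be carried out by machine for small parameters. The following sketch is illustrative (the function name is hypothetical); it takes g = 1 in (6.1.1) and enumerates the pairs r > s > 0 with (r,s) = 1 and r + s odd.

```python
from math import gcd


def primitive_triples(max_r):
    """Primitive solutions (x, y, z) of x^2 + y^2 = z^2 given by (6.1.1) with g = 1,
    for all coprime r > s > 0 of opposite parity with r <= max_r."""
    triples = []
    for r in range(2, max_r + 1):
        for s in range(1, r):
            if gcd(r, s) == 1 and (r + s) % 2 == 1:
                x, y, z = 2 * r * s, r * r - s * s, r * r + s * s
                # The identity (2rs)^2 + (r^2 - s^2)^2 = (r^2 + s^2)^2.
                assert x * x + y * y == z * z
                triples.append((x, y, z))
    return triples


for triple in primitive_triples(4):
    print(triple)
```

With `max_r = 4` this lists (4, 3, 5), (12, 5, 13), (8, 15, 17) and (24, 7, 25), each with x even and y odd, as arranged in the derivation above.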


[Figure: a right-angled triangle with legs $2grs$ and $g(r^2 - s^2)$ and hypotenuse $g(r^2 + s^2)$.]

Figure 6.1. Parameterization of all integer-sided right-angled triangles.


One can also give a nice geometric proof of the parametrization in (6.1.1). We 
start with a reformulation of the question. 


Exercise 6.1.1. Prove that the integer solutions to $x^2 + y^2 = z^2$ with z > 0 and (x, y, z) = 1 are
in 1-to-1 correspondence with the rational solutions u, v to $u^2 + v^2 = 1$.


Where else does a line going through (1,0) intersect the circle $x^2 + y^2 = 1$?
Unless the line is vertical it will hit the unit circle in exactly one other point, which
we will denote by (u,v). Note that u < 1. If the line has slope t, then $t = v/(u - 1)$
is rational if u and v are.


$^{2}$ That is, one is even, the other is odd.

Figure 6.2. A line through (1,0) on the circle $x^2 + y^2 = 1$.


In the other direction, the line through (1,0) of slope t is $y = t(x - 1)$ which
intersects $x^2 + y^2 = 1$ where $1 - x^2 = y^2 = t^2(x - 1)^2$, so that either x = 1 and
y = 0, or we have $1 + x = t^2(1 - x)$, which yields the point (u,v) with

$$u = \frac{t^2 - 1}{t^2 + 1} \quad\text{and}\quad v = \frac{-2t}{t^2 + 1}.$$

These are both rational if t is. We have therefore proved that $u, v \in \mathbb{Q}$ if and only
if $t \in \mathbb{Q}$. In other words the line of slope t through (1,0) hits the unit circle again
at another rational point if and only if t is rational, and then we can classify those
points in terms of t. Therefore, writing $t = -r/s$ where (r,s) = 1, we have

$$u = \frac{r^2 - s^2}{r^2 + s^2} \quad\text{and}\quad v = \frac{2rs}{r^2 + s^2},$$

the same parametrization to the Pythagorean equation as in (6.1.1) when we clear
out denominators.
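The slope construction is easy to test with exact rational arithmetic. This is a minimal sketch (the function name is hypothetical) using Python's `fractions.Fraction`, so that no rounding is involved.

```python
from fractions import Fraction


def point_from_slope(t):
    """Second intersection (u, v) of the unit circle with the line of slope t
    through (1, 0), using u = (t^2 - 1)/(t^2 + 1) and v = -2t/(t^2 + 1)."""
    t = Fraction(t)
    u = (t * t - 1) / (t * t + 1)
    v = -2 * t / (t * t + 1)
    return u, v


for r, s in [(2, 1), (3, 2), (4, 1)]:
    u, v = point_from_slope(Fraction(-r, s))
    # (u, v) lies on the circle, and clearing denominators gives a Pythagorean triple.
    print((r, s), (u, v), u * u + v * v == 1)
```

Taking t = -2/1 gives (u, v) = (3/5, 4/5), that is, the triple 3, 4, 5 with the roles of x and y in (6.1.1) interchanged.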


Exercise 6.1.2.† Find a formula for all the rational points on the curve $x^2 - y^2 = 3$.


Exercise 6.1.3. We call {a, b, c} a primitive Pythagorean triple if a, b, and c are pairwise coprime
integers for which $a^2 + b^2 = c^2$.

(a) Prove that, in a primitive Pythagorean triple, the difference in length between the
hypotenuse and each of the other sides is either a square or twice a square.

(b) Can one find primitive Pythagorean triples in which the hypotenuse is three units longer
than one of the other sides? Either give an example or prove that it is impossible.

(c)† One can find primitive Pythagorean triples in which the hypotenuse is one unit longer
than one of the other sides, e.g., {3,4,5}, {5,12,13}, {7,24,25}, {9,40,41}, {11,60,61}.
Parametrize all such solutions.

(d)† One can find primitive Pythagorean triples in which the hypotenuse is two units longer than
one of the other sides, e.g., {3,4,5}, {8,15,17}, {12,35,37}, {16,63,65}, {20,99,101}.
Parametrize all such solutions.


Exercise 6.1.4. (a) Prove that the side lengths of a primitive Pythagorean triple are $\not\equiv 2 \pmod 4$.
(b) Given integer n > 1 with $n \not\equiv 2 \pmod 4$, explicitly give a primitive Pythagorean triple
which has n as a side length.


Exercise 6.1.5.† Prove that there are infinitely many triples of coprime squares in arithmetic
progressions.


Around 1637, Pierre de Fermat was studying the proof of (6.1.1) in his copy of 
Bachet’s translation of Diophantus’s Arithmetica. In the margin he wrote: 


I have discovered a truly marvellous proof that it is impossible to separate a 
cube into two cubes, or a fourth power into two fourth powers, or in general, 
any power higher than the second into two like powers. This margin is too 
narrow to contain it. 


—PIERRE DE FERMAT (1637), in his copy of Arithmetica 


In other words, Fermat claimed that for every integer $n \ge 3$ there do not exist
positive integers x, y, z for which

$$x^n + y^n = z^n.$$

This is known as “Fermat’s Last Theorem”. Fermat did not subsequently mention
this problem or his truly marvellous proof elsewhere, and the proof has not, to
date, been rediscovered, despite many efforts.$^{3}$ Fermat did show that there are no
solutions when n = 4 and we will present his proof in section 6.4, as well as some
consequences for more general exponents n in Fermat’s Last Theorem.


6.2. No solutions to a Diophantine equation through descent 


Some Diophantine equations can be shown to have no solutions by starting with a 
purported smallest solution and finding an even smaller one, thereby establishing 
a contradiction. Such a proof by descent can be achieved in various different ways. 


No solutions through prime divisibility 


For some equations one can perform descent by considering the divisibility of the 
variables by various primes. We now give such a proof that $\sqrt{2}$ is irrational.


Proof of Proposition 3.4.1 by 2-divisibility. [$\sqrt{2}$ is irrational.] Let us recall
that if $\sqrt{2}$ is rational, then we can write it as a/b so that $a^2 = 2b^2$. Let us suppose
that (b,a) gives the smallest solution to $y^2 = 2x^2$ in positive integers. Now 2
divides $2b^2 = a^2$ so that 2|a. Writing a = 2A, thus $b^2 = 2A^2$, and so 2|b. Writing
b = 2B we obtain a solution $A^2 = 2B^2$ where A and B are half the size of a and b,
contradicting the assumption that (b,a) is minimal.


Exercise 6.2.1. Show that there are no non-zero integer solutions to $x^3 + 3y^3 + 9z^3 = 0$.


$^{3}$ Fermat wrote several important thoughts about number theory on his personal copy of Arithmetica,
without proof. When he died his son, Samuel, made these available by republishing Arithmetica with
his father’s annotations. This is the last of those claims to have been fully understood.


No solutions through geometric descent


Proof of Proposition 3.4.1 by geometric descent. Again assume that $\sqrt{2}$ =
a/b with a and b positive integers, where a is minimal. Hence $a^2 = 2b^2$ which gives
rise to the smallest isosceles, right-angled triangle, OPQ with integer side lengths
OP = OQ = b, PQ = a and angles ∠POQ = 90°, ∠PQO = ∠QPO = 45°. Now mark
a point R which is b units along PQ from Q and then drop a perpendicular to meet
OP at the point S so that SR is perpendicular to PQ. Then ∠RPS = ∠QPO = 45°,
and so ∠RSP = 180° − 90° − 45° = 45° by considering the angles in the triangle RSP.
Therefore RSP is a smaller isosceles, right-angled triangle than OPQ. Moreover
we have side lengths RS = PR = a − b. To establish our contradiction we need to
show that the hypotenuse, PS, also has integer length.


Figure 6.3. No solutions through geometric descent. 


The two triangles, OQS and RQS, are congruent, since they both contain a
right-angle opposite SQ and adjacent to a side of length b (OQ and RQ, respectively).
Therefore OS = SR = a − b and so PS = OP − OS = b − (a − b) = 2b − a.
Hence RSP is a smaller isosceles, right-angled triangle than OPQ with integer side
lengths, contradicting the assumed minimality of OPQ.


One can write this proof more algebraically: As $a^2 = 2b^2$, so a > b > a/2. Now
$$(2b - a)^2 = a^2 - 4ab + 2b^2 + 2b^2 = a^2 - 4ab + 2b^2 + a^2 = 2(a - b)^2.$$
However 0 < 2b − a < a, contradicting the minimality of a.
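The descent step $(a, b) \mapsto (2b - a, a - b)$ cannot be run on an actual solution of $a^2 = 2b^2$, since none exists. As an illustration only, the sketch below applies it to pairs with $a^2 - 2b^2 = \pm 1$; these near-solutions are an assumption introduced here and do not appear in the text. The step satisfies $(2b - a)^2 - 2(a - b)^2 = -(a^2 - 2b^2)$ for all integers a, b, so it would carry a genuine solution to a strictly smaller one. The function names are hypothetical.

```python
def descent_step(a, b):
    """One step of the algebraic descent: (a, b) -> (2b - a, a - b)."""
    return 2 * b - a, a - b


def descend(a, b):
    """Apply the descent step while a > b > a/2 and b > 1, recording each pair."""
    chain = [(a, b)]
    while b > 1 and b < a < 2 * b:
        a, b = descent_step(a, b)
        chain.append((a, b))
    return chain


# Illustration with a near-solution: 17^2 - 2 * 12^2 = 1.
for a, b in descend(17, 12):
    print((a, b), a * a - 2 * b * b)
```

Each step shrinks the pair and flips the sign of $a^2 - 2b^2$, ending at (1, 1). With a true solution the value would stay 0 at every step, which is exactly the impossible infinite descent.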
Proof of Proposition by an analogous descent. [If d is an integer for
which $\sqrt{d}$ is rational, then $\sqrt{d}$ is an integer.] If $\sqrt{d}$ is rational, then we can write
it as a/b so that $a^2 = db^2$. Let us suppose that (b,a) gives the smallest solution
to $y^2 = dx^2$ in positive integers. Let r be the smallest integer $\ge db/a$, so that
$\frac{db}{a} + 1 > r \ge \frac{db}{a}$ and therefore $a > ra - db \ge 0$. Then

$$(ra - db)^2 = da^2 - 2rdab + d^2b^2 + (r^2 - d)a^2$$
$$= da^2 - 2rdab + d^2b^2 + (r^2 - d)db^2 = d(rb - a)^2.$$

However $0 \le ra - db < a$, contradicting the minimality of a, unless ra − db = 0. In
this case $r^2 = d \cdot db^2/a^2 = d$.


6.3. Fermat’s “infinite descent” 


Fermat proved that there are no right-angled triangles with all integer sides whose
area is a square (see exercise 6.3.1 below). In so doing he developed the important
technique of “infinite descent”, which we now exhibit in two related questions. (The
reader can read the proof of only one of the two following similar theorems. They
both lead to the same Corollary 6.4.1.)


Theorem 6.1. There are no solutions in non-zero integers x, y, z to

$$x^4 + y^4 = z^2.$$


Proof. Assume that there is a solution and let x, y, z be the solution in positive
integers with z minimal. We may assume that gcd(x, y) = 1 or else we can divide
the equation through by the fourth power of gcd(x, y) to obtain a smaller solution.
Here we have

$$(x^2)^2 + (y^2)^2 = z^2 \quad\text{with } \gcd(x^2, y^2) = 1,$$

and so, by (6.1.1), there exist integers r, s with (r,s) = 1 and r + s odd such that

$$x^2 = 2rs,\quad y^2 = r^2 - s^2,\quad\text{and}\quad z = r^2 + s^2$$

(swapping the roles of x and y if necessary to ensure that x is even). Now r
and s have the same sign since $rs = x^2/2$, so we may assume they are both > 0
(multiplying each by −1 if necessary). Now $s^2 + y^2 = r^2$ with y odd and (r,s) = 1
and so, by (6.1.1), there exist integers a, b with (a,b) = 1 and a + b odd such that

$$s = 2ab,\quad y = a^2 - b^2,\quad\text{and}\quad r = a^2 + b^2,$$

and so

$$x^2 = 2rs = 4ab(a^2 + b^2).$$

Now a and b have the same sign since ab = s/2 > 0, and therefore we may assume
they are both > 0 (multiplying each by −1 if necessary).


Now a, b, and $a^2 + b^2$ are pairwise coprime positive integers whose product is
a square so they must each be squares by exercise 3.3.7(b). Write $a = u^2$, $b = v^2$,
and $a^2 + b^2 = w^2$ for some positive integers u, v, w. Therefore

$$u^4 + v^4 = a^2 + b^2 = w^2$$

yields another solution to the original equation. We wish to compare this to the
solution (x, y, z) we started with. We find that

$$w \le w^2 = a^2 + b^2 = r < r^2 + s^2 = z,$$

contradicting the minimality of z.


Theorem 6.2. There are no solutions in positive integers x,y,z to 

gy = 2. 
Proof. If there is a solution, take the one with « minimal. We may assume (2, y) = 
1 or else we divide through by the fourth power of the common factor. 


We begin by noting that 
(y?)? + 2? = (x)? with ged(a?,y”) = 1. 
If z is even, then, by (6.L.J), there exist integers X,Y with (X,Y) = 1, of opposite 
parity, for which 


og? = X*4+Y? and y2=X?-Y?, sothat X*—Y* = (ay). 
Now X? < 2”, contradicting the minimality of x. 


Therefore z is odd. By (6.1.1) there exist integers r, s with (r,s) = 1, of opposite 


parity, for which 

xg? =r?+s7 and y* = 2rs. 
Now r and s have the same sign since rs = y?/2 > 0, and therefore we may 
assume they are both > 0 (multiplying each by —1 if necessary). From the equation 


2rs = y^2 we deduce that r = 2R^2, s = Z^2 for some integers R, Z (swapping the
roles of r and s, if necessary). From (6.1.1) applied to the equation r^2 + s^2 = x^2,
there exist integers u, v with (u, v) = 1, of opposite parity, for which r = 2uv and
s = u^2 - v^2. Now uv = r/2 = R^2, so we may assume they are both positive
(multiplying each by -1 if necessary), and so u = m^2, v = n^2 for some integers
m, n. Therefore


m^4 - n^4 = u^2 - v^2 = s = Z^2.


Now m^2 ≤ (mn)^2 = uv = r/2 < x/2, contradicting the minimality of x.
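
The two descent arguments above rule out all solutions, so a computer search can only confirm them on a finite range. The following minimal sketch (the function names and the search bound are our own choices, not part of the text) looks for squares of the form x^4 + y^4 and x^4 - y^4 with small positive x and y.

```python
from math import isqrt


def is_square(n):
    """Return True when the non-negative integer n is a perfect square."""
    r = isqrt(n)
    return r * r == n


def small_solutions(bound):
    """Search 1 <= x, y <= bound for squares of the form x^4 + y^4 or x^4 - y^4."""
    plus = [(x, y) for x in range(1, bound + 1) for y in range(x, bound + 1)
            if is_square(x**4 + y**4)]
    minus = [(x, y) for x in range(2, bound + 1) for y in range(1, x)
             if is_square(x**4 - y**4)]
    return plus, minus


# Both lists are empty, in agreement with the two theorems.
print(small_solutions(60))
```

Such a search proves nothing beyond the bound; it is the descent that handles all cases at once.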


Exercise 6.3.1 (Fermat, 1659). 
(a)† Prove that there is no right-angled, integer-sided, triangle whose area is a square.


(b) Deduce that there is no right-angled, rational-sided, triangle whose area is 1. 


(c) Deduce that there are no integer solutions to x^4 + 4y^4 = z^2.


In appendix 6B we will see an alternative proof of these results using classical Greek geometry. 


6.4. Fermat’s Last Theorem 


Fermat’s Last Theorem is the assertion that for every integer n ≥ 3 there do not
exist positive integers x, y, z for which


x^n + y^n = z^n.


Corollary 6.4.1 (Fermat). There are no solutions in non-zero integers x,y,z to 


x^4 + y^4 = z^4.


Exercise 6.4.1. Prove this using Theorem or Theorem 

We deduce that Fermat’s Last Theorem holds for all exponents n ≥ 3 if it holds
for all odd prime exponents:


Proposition 6.4.1. If Fermat’s Last Theorem is false, then there exists an odd 
prime p and pairwise coprime non-zero integers x,y,z such that 


x^p + y^p + z^p = 0.


Proof. Suppose that x^n + y^n = z^n with x, y, z > 0 and n ≥ 3. If two of x, y, and z
have a common factor, then it must divide the third and so we can divide out the
common factor. Hence we may assume that x, y, z are pairwise coprime positive
integers. Now any integer n ≥ 3 has a factor m which is either = 4 or is an odd
prime (see exercise 3.1.3(b)). Hence, if n = dm, then (x^d)^m + (y^d)^m = (z^d)^m, so
we get a solution to Fermat’s Last Theorem with exponent m. We can rule out
m = 4 by Corollary 6.4.1. Therefore m = p is an odd prime and we have the desired
solution (x^d)^p + (y^d)^p + (-z^d)^p = 0.
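
The reduction in this proof rests on finding a factor m of n that is either 4 or an odd prime. As an illustration only, here is one way to produce such a factor (the name `fermat_factor` is ours, and trial division is just the simplest choice).

```python
def fermat_factor(n):
    """Return a divisor m of n (n >= 3) that is either 4 or an odd prime."""
    if n < 3:
        raise ValueError("n must be at least 3")
    m = n
    while m % 2 == 0:          # strip all factors of 2
        m //= 2
    if m == 1:
        return 4               # n is a power of 2 with n >= 4, so 4 divides n
    d = 3                      # m > 1 is odd: find its smallest prime factor
    while d * d <= m:
        if m % d == 0:
            return d
        d += 2
    return m                   # m itself is an odd prime


for n in (12, 16, 35, 7):
    m = fermat_factor(n)
    print(n, "= d * m with m =", m, "and d =", n // m)
```

Writing n = dm, a solution (x, y, z) for exponent n then gives the solution (x^d, y^d, z^d) for exponent m, exactly as in the proof.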


A brief history of equation solving 


There have been many attempts to prove Fermat’s Last Theorem, inspiring the 
development of much great mathematics, for example, ideal theory (see appendices 
3D and 12B). We will discuss one beautiful advance due to Sophie Germain from 
the beginning of the 19th century (see section 7.27 of appendix 7F).


In 1994 Andrew Wiles proved Fermat’s Last Theorem, developing ideas of Frey, 
Ribet, and Serre involving modular forms, a subject far removed from the original 
question. The proof is extraordinarily deep, involving some of the most profound 
themes in arithmetic geometry.[4] If the whole proof were written in the leisurely
style of, say, this book, it would probably take a couple of thousand pages. This 
could not be the proof that Fermat believed that he had—could Fermat have been 
correct? Could there be a short, elementary, marvelous proof still waiting to be 
found? Or will Fermat’s claim always remain a mystery? 


To some extent one can measure the difficulty of solving Diophantine equations 
(especially rational solutions to equations with two variables) by their degree.[5] The
first three chapters of this book focus on linear (degree-one) equations, culminating
in section 3.6. Much of the rest of this book provides tools for studying degree-two
(quadratic) equations; see chapters 8 and 9, sections 11.2 and 11.3, and chapter 
12. Degree-three (cubic) equations give rise to elliptic curves; many of the key 
questions about elliptic curves lie shrouded in mystery and so they are intensively
researched in number theory today (see chapter 17). In 1983 Gerd Faltings showed 
that higher-degree Diophantine equations only have finitely many rational solutions 
(though not how to find those solutions). 

For higher-degree equations perhaps the most interesting cases are Diophantine 
equations with varying degree, like the Fermat equation. Another famous example 
is Catalan’s conjecture: The positive integer powers are


1, 4, 8, 9, 16, 25, 27, 32, 36, 49, 64, ...

which seem to get more widely spread out as they get larger. Only two of the numbers in
our list, 8 and 9, differ by 1, and Catalan conjectured that this is the only example
of powers differing by 1. That is, the only integer solution to


x^p - y^q = 1 with x, y ≠ 0 and p, q ≥ 2,
is 3^2 - 2^3 = 1. This was shown to be true by Preda Mihăilescu in 2002.

[4] See our sequel for some discussion of the ideas involved in the proof.

[5] A better but more sophisticated invariant is the genus, which requires quite a bit of algebraic
geometry to define and is beyond the scope of this book.


Combining these two famous equations leads to the Fermat-Catalan equation 


x^p + y^q = z^r where (x, y, z) = 1 and 1/p + 1/q + 1/r < 1.

We insist that (x, y, z) = 1 because one can find “trivial” solutions like 2^k + 2^k =
2^{k+1} in many cases (see exercise 6.5.8 for more examples). Obviously there are
solutions when one of p, q, r is 1, so we insist they are all ≥ 2. One can find
solutions when two of the exponents equal 2, and so the peculiar looking condition
1/p + 1/q + 1/r < 1 turns out to be the correct one. We do know of ten solutions:


1 + 2^3 = 3^2, 2^5 + 7^2 = 3^4, 7^3 + 13^2 = 2^9, 2^7 + 17^3 = 71^2, 3^5 + 11^4 = 122^2,
17^7 + 76271^3 = 21063928^2, 1414^3 + 2213459^2 = 65^7, 9262^3 + 15312283^2 = 113^7,


43^8 + 96222^3 = 30042907^2, 33^8 + 1549034^2 = 15613^3.
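
These identities involve large numbers, so it is reassuring to check them by machine. The sketch below encodes each solution as a tuple (x, p, y, q, z, r) standing for x^p + y^q = z^r and tests the three defining conditions. The encoding and the function name are our own, and for the first solution we write 1 as 1^7, one of many admissible exponents.

```python
from fractions import Fraction
from math import gcd

# (x, p, y, q, z, r) encodes x^p + y^q = z^r.
# The exponent 7 attached to the base 1 in the first entry is our choice.
SOLUTIONS = [
    (1, 7, 2, 3, 3, 2),
    (2, 5, 7, 2, 3, 4),
    (7, 3, 13, 2, 2, 9),
    (2, 7, 17, 3, 71, 2),
    (3, 5, 11, 4, 122, 2),
    (17, 7, 76271, 3, 21063928, 2),
    (1414, 3, 2213459, 2, 65, 7),
    (9262, 3, 15312283, 2, 113, 7),
    (43, 8, 96222, 3, 30042907, 2),
    (33, 8, 1549034, 2, 15613, 3),
]


def is_fermat_catalan(x, p, y, q, z, r):
    """Check (x, y, z) = 1, 1/p + 1/q + 1/r < 1, and x^p + y^q = z^r."""
    coprime = gcd(gcd(x, y), z) == 1
    exponents_ok = Fraction(1, p) + Fraction(1, q) + Fraction(1, r) < 1
    return coprime and exponents_ok and x**p + y**q == z**r


for solution in SOLUTIONS:
    print(solution, is_fermat_catalan(*solution))
```

Exact rational arithmetic is used for the exponent condition so that borderline sums such as 1/2 + 1/3 + 1/7 are compared without rounding.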


It is conjectured that there are only finitely many solutions x^p, y^q, z^r to the Fermat-
Catalan equation; perhaps these ten are all the solutions. All of our ten solutions 
have an exponent equal to 2. So one might further conjecture that there are no 
solutions to the Fermat-Catalan equation with p,q,r all > 2. These are open 
questions and mathematicians are making headway. Henri Darmon and I proved in 
1995 that there are only finitely many solutions for each fixed triple p, q,r. Today we 
know that for various infinite families of exponent triples p, q, r, the Fermat-Catalan
equation has no solutions: For example when p = q and 1/p + 1/q + 1/r < 1 there are
no solutions if r is divisible by 2 or by 3 or by p, or if p is even and r is divisible by 
5, etc. (see [1] for the state of the art). 


Now that Fermat’s Last Theorem has been proved, what can take its place 
as the “holy grail” of Diophantine equations? The abc-conjecture is clearly an 
important problem that would have profound effects on equations and even in other 
areas of number theory. In appendix 6A we will discuss its analogy for polynomials 
and then discuss the abc-conjecture itself and its influence on other equations, in 


section 11.5.


Additional exercises


Exercise 6.5.1. Find all rational-sided right-angled triangles in which the area equals the perimeter.
Prove that 5, 12, 13 and 6, 8, 10 are the only such integer-sided triangles.


Exercise 6.5.2.† Let n be an integer > 2 that is ≢ 2 (mod 4). Prove that there are 2^{ω(n)-1} distinct
primitive Pythagorean triangles in which n is the length of a side which is not the hypotenuse,
where ω(n) counts the number of distinct prime factors of n.


Exercise 6.5.3.† Find a 1-to-1 correspondence between pairs of integers b, c > 0 for which
x^2 - bx - c and x^2 - bx + c are both factorable over Z, and right-angled triangles in which all
three sides are integers. 


Exercise 6.5.4. Prove that if f(x) ∈ Z[x] is a quadratic polynomial for which f(x) and f(x) + 1
both have integer roots, then f(x) + 1 is the square of a linear polynomial. (Try substituting the
roots of f(x) into f(x) + 1 and studying divisibilities of the differences of the roots.)


Exercise 6.5.5.† We wish to show that α = (1 + √5)/2 is irrational. Suppose it is rational, so that
α = p/q with (p, q) = 1. Now α satisfies the equation x^2 = x + 1, so dividing through by x we
have x = (1 + x)/x, and so α = (p + q)/p. Prove that p/q cannot equal (p + q)/p and therefore
establish a contradiction.


Exercise 6.5.6. Generalize the proof in the last exercise, to prove that if α is a rational root of
x^2 - ax - b ∈ Z[x], then α is an integer which divides b.


Exercise 6.5.7. Prove that 2n is the length of the perimeter of a right-angled integer-sided 
triangle if and only if there exist divisors d_1, d_2 of n for which d_1 < d_2 < 2d_1.


Exercise 6.5.8. Suppose that integers p, q, r are given. For any integers a and b let c = a^p + b^q. If
we multiply this through by c^n, where n is divisible by p and q, then (a c^{n/p})^p + (b c^{n/q})^q = c^{n+1}.
Determine conditions on p, q, and r under which we find an integer n such that c^{n+1} is an rth
power (and therefore find an integer solution to x^p + y^q = z^r, albeit with (x, y, z) > 1).


Exercise 6.5.9. Calculations show that every integer in [129,300] is the sum of distinct squares. 
Deduce that every integer > 128 is the sum of distinct squares. (In exercise 2.5.6(f) we showed
that there are infinitely many integers that cannot be written as the sum of three squares. In 
appendix 12E we will show that every integer is the sum of four squares.) 


Exercise 6.5.10. Prove that there are infinitely many integers that cannot be written as the sum 
of three cubes. 


Exercise 6.5.11.† Calculations show that every integer in [12759, 30000] is the sum of distinct
cubes of positive integers. Deduce that every integer > 12758 is the sum of distinct cubes of 
positive integers. (In 2015 Siksek showed that every integer > 454 is the sum of at most seven 
positive cubes. It is believed, but not proven, that every sufficiently large integer is the sum of at 
most four positive cubes.) 


Exercise 6.5.12. Verify the identity 6x = (x + 1)^3 + (x - 1)^3 - 2x^3. Deduce that every prime is
the sum of no more than five cubes of integers (which can be positive or negative). 


Exercise 6.5.13. (a) Prove that n^4 ≡ 0 or 1 (mod 16) for all integers n.
Let N be divisible by 16. 
(b) Show that if N is the sum of 15 fourth powers, then each of those fourth powers is even. 
(c) Deduce that N is the sum of 15 fourth powers if and only if N/16 is the sum of 15 fourth 
powers. 
(d) Prove that 31 is not the sum of 15 fourth powers but is the sum of 16 fourth powers. 
(e) Deduce that there are infinitely many positive integers N that are not the sum of 15 fourth powers.


(In 2005, Deshouillers, Kawada, and Wooley showed that every integer > 13792 can be 
written as the sum of 16 fourth powers.) 


In 1770 Waring asked whether for all integers k there exists an integer g(k) 
such that every positive integer is the sum of at most g(k) kth powers of positive 
integers. This was proved by Hilbert in 1909 but it is still a challenge to evaluate 
the smallest possible g(k) for each k. We discuss this further in appendix 17D.


Appendix 6A. Polynomial 
solutions of Diophantine 
equations 


6.6. Fermat’s Last Theorem in C[t]


The notation C[t] denotes polynomials whose coefficients are complex numbers. In
section 6.1 we saw that all integer solutions to x^2 + y^2 = z^2 are derived from letting
t be a rational number in the polynomial solution


(t^2 - 1)^2 + (2t)^2 = (t^2 + 1)^2.


We now prove that there are no “genuine” polynomial solutions to Fermat’s equa- 
tion 


(6.6.1) x^p + y^p = z^p


with exponent p larger than 2 (where by genuine we mean that (x(t), y(t), z(t)) is 
not a polynomial multiple of a solution of (6.6.1) in complex numbers). 


Proposition 6.6.1. There are no genuine polynomial solutions x(t), y(t), z(t) ∈
C[t] to x(t)^p + y(t)^p = z(t)^p with p ≥ 3.


Proof. Assume that there is a solution with x, y, and z all non-zero to (6.6.1)
where p ≥ 3. We may assume that x, y, and z have no common (polynomial)
factor or else we can divide out by that factor (and that they are pairwise coprime
by the same argument as in section 6.1). Our first step will be to differentiate
to get


p x^{p-1} x' + p y^{p-1} y' = p z^{p-1} z'


and after dividing out the common factor p, this leaves us with 


(6.6.2) x^{p-1} x' + y^{p-1} y' = z^{p-1} z'.

We now have two linear equations (6.6.1) and (6.6.2) (thinking of x^{p-1}, y^{p-1}, and
z^{p-1} as our variables), which suggests we use linear algebra to eliminate a variable:


Multiply (6.6.1) by y' and (6.6.2) by y, and subtract, to get

x^{p-1}(xy' - yx') = x^{p-1}(xy' - yx') + y^{p-1}(yy' - yy') = z^{p-1}(zy' - yz').
Therefore x^{p-1} divides z^{p-1}(zy' - yz'), but since x and z have no common factors,
this implies that
(6.6.3) x^{p-1} divides zy' - yz'.
This is a little surprising, for if zy' - yz' is non-zero, then a high power of x divides
zy' - yz', something that does not seem consistent with (6.6.1).

Now, if zy' - yz' = 0, then (y/z)' = 0 and so y is a constant multiple of z,
contradicting our statement that y and z have no common factor. Therefore (6.6.3)
implies, taking degrees of both sides, that

(p - 1) degree(x) ≤ degree(zy' - yz') ≤ degree(y) + degree(z) - 1,
since degree(y') = degree(y) - 1 and degree(z') = degree(z) - 1. Adding degree(x)
to both sides gives
(6.6.4) p degree(x) < degree(x) + degree(y) + degree(z).


The right side of (6.6.4) is symmetric in x, y, and z. The left side is a function of 
x simply because of the order in which we chose to do things above. We could just 
as easily have derived the same statement with y or z in place of x on the left side 


of (6.6.4), so that 
p degree(y) < degree(x) + degree(y) + degree(z)
and p degree(z) < degree(x) + degree(y) + degree(z).
Adding these last three inequalities together and then dividing out by degree(x) +
degree(y) + degree(z) implies
p < 3,
and so Fermat’s Last Theorem is proved, at least for polynomials. 


That Fermat’s Last Theorem is not difficult to prove for polynomials is an old 
result, going back certainly as far as Liouville in 1851. 


Exercise 6.6.1. Prove that all solutions to x(t)^2 + y(t)^2 = z(t)^2 in polynomials are a scalar
multiple of some solution of the form (r(t)^2 - s(t)^2)^2 + (2r(t)s(t))^2 = (r(t)^2 + s(t)^2)^2.


6.7. a + b = c in C[t]


We now intend to extend the idea in our proof of Fermat’s Last Theorem for 
polynomials to as wide a range of questions as possible. It takes a certain genius 
to generalize to something far simpler than the original. But what could possibly 
be more simply stated, yet more general, than Fermat’s Last Theorem? It was 
Richard C. Mason (1983) who gave us that insight: Look for solutions to 


a + b = c.


We will just follow through the above proof of Fermat’s Last Theorem for polynomials
(Proposition 6.6.1) and see where it leads: Start by assuming, with no loss
of generality, that a, b, and c are all non-zero polynomials without common factors
(or else all three share the common factor and we can divide it out). Then we
differentiate to get
a' + b' = c'.

Next we need to do linear algebra. It is not quite so obvious how to proceed 
analogously, but what we do learn in a linear algebra course is to put our coefficients 
in a matrix and solutions follow if the determinant is non-zero. This suggests 
defining 


Δ(t) := | a(t)   b(t)  |
        | a'(t)  b'(t) |.


Then if we add the first column to the second, we get


Δ(t) = | a(t)   c(t)  |
       | a'(t)  c'(t) |,
and similarly
Δ(t) = | c(t)   b(t)  |
       | c'(t)  b'(t) |


by adding the second column to the first, a beautiful symmetry.

We note that Δ(t) ≠ 0, or else ab' - a'b = 0 so b is a scalar multiple of a (with
the same argument as above), contradicting our hypothesis. 

To find the appropriate analogy to (6.6.3), we consider the power to which 
the factors of a (as well as b and c) divide our determinant: Let α be a root of
a(t), and suppose that (t - α)^e is the highest power of (t - α) which divides a(t)
(we write (t - α)^e ‖ a(t)). Now we can write a(t) = U(t)(t - α)^e where U(t) is a
polynomial that is not divisible by (t - α), so that a'(t) = (t - α)^{e-1} V(t) where
V(t) := U'(t)(t - α) + eU(t). Now (t - α, V(t)) = (t - α, eU(t)) = 1, and so
(t - α)^{e-1} ‖ a'(t). Therefore

Δ(t) = a(t)b'(t) - a'(t)b(t) = (t - α)^{e-1} W(t)
where W(t) := U(t)(t - α)b'(t) - V(t)b(t) and (t - α, W(t)) = (t - α, V(t)b(t)) = 1
as t - α does not divide b(t) or V(t). Therefore we have proved that
(t - α)^{e-1} ‖ Δ(t).
This implies that (t - α)^e divides Δ(t)(t - α). Multiplying all such (t - α)^e together
we obtain (since they are pairwise coprime) that
a(t) divides Δ(t) ∏_{a(α)=0} (t - α).
In fact a(t) only appears on the left side of this equation because we studied the
linear factors of a; analogous statements for b(t) and c(t) are also true, and since
a(t), b(t), c(t) have no common roots, we can combine those statements to read
(6.7.1) a(t)b(t)c(t) divides Δ(t) ∏_{(abc)(α)=0} (t - α).


The next step is to take the degrees of both sides of (6.7.1). The degree of
∏_{(abc)(α)=0} (t - α) is precisely the total number of distinct roots of a(t)b(t)c(t).
Therefore
degree(a) + degree(b) + degree(c) ≤ degree(Δ) + #{α ∈ C : (abc)(α) = 0}.
Now, using the three different representations of Δ above, we have
degree(Δ) ≤ degree(a) + degree(b) - 1,
degree(Δ) ≤ degree(a) + degree(c) - 1,
degree(Δ) ≤ degree(c) + degree(b) - 1.
Inserting all this into the previous inequality we get
degree(a), degree(b), degree(c) < #{α ∈ C : (abc)(α) = 0}.


Put another way, this result can be read as: 


Theorem 6.3 (The abc Theorem for Polynomials). If a(t), b(t), c(t) ∈ C[t] do not
have any common roots and provide a genuine polynomial solution to a(t) + b(t) =
c(t), then the maximum of the degrees of a(t), b(t), c(t) is less than the number of
distinct roots of a(t)b(t)c(t) = 0.


This is a “best possible” result in that we can find infinitely many examples 
where there is exactly one more zero of a(t)b(t)c(t) = 0 than the largest of the 
degrees, for example the familiar identity 


(2t)^2 + (t^2 - 1)^2 = (t^2 + 1)^2;
or the rather less interesting
t^n + 1 = (t^n + 1).
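
To see the theorem at work on these two examples, one can count distinct roots by machine. The sketch below is only an illustration under our own conventions: a polynomial is a list of coefficients from the constant term upward, and the number of distinct complex roots of f is computed as degree(f) - degree(gcd(f, f')), a standard fact that the text does not use. All helper names are hypothetical.

```python
from fractions import Fraction


def trim(f):
    """Drop trailing zero coefficients (coefficients run from t^0 upward)."""
    f = list(f)
    while f and f[-1] == 0:
        f.pop()
    return f


def degree(f):
    return len(trim(f)) - 1


def multiply(f, g):
    out = [Fraction(0)] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] += Fraction(a) * b
    return trim(out)


def derivative(f):
    return trim([i * Fraction(c) for i, c in enumerate(f)][1:])


def remainder(f, g):
    """Remainder of f on division by the non-zero polynomial g."""
    f, g = [Fraction(c) for c in trim(f)], trim(g)
    while len(f) >= len(g):
        q = f[-1] / g[-1]
        shift = len(f) - len(g)
        for i, c in enumerate(g):
            f[i + shift] -= q * c
        f = trim(f)
    return f


def poly_gcd(f, g):
    f, g = trim(f), trim(g)
    while g:
        f, g = g, remainder(f, g)
    return f


def distinct_roots(f):
    """Number of distinct complex roots of f: deg f - deg gcd(f, f')."""
    return degree(f) - degree(poly_gcd(f, derivative(f)))


# (2t)^2 + (t^2 - 1)^2 = (t^2 + 1)^2
a, b, c = [0, 0, 4], [1, 0, -2, 0, 1], [1, 0, 2, 0, 1]
abc = multiply(multiply(a, b), c)
print(max(degree(a), degree(b), degree(c)), distinct_roots(abc))

# t^3 + 1 = (t^3 + 1), the case n = 3 of the second example
a3, b3, c3 = [0, 0, 0, 1], [1], [1, 0, 0, 1]
abc3 = multiply(multiply(a3, b3), c3)
print(max(degree(a3), degree(b3), degree(c3)), distinct_roots(abc3))
```

In both cases the number of distinct roots exceeds the largest degree by exactly one, which is the sense in which the theorem is best possible.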


Exercise 6.7.1. Let a, b, and c be given non-zero integers, and suppose n, p, q, r > 1.

(a) Prove that there are no genuine polynomial solutions x(t), y(t), z(t) to ax^n + by^n = cz^n
with n ≥ 3.

(b) Prove that if there is a genuine polynomial solution x(t), y(t), z(t) to ax^p + by^q = cz^r in
which x, y, and z have no common root, then 1/p + 1/q + 1/r > 1.

(c) Deduce in (b) that this implies that at least one of p, q, and r must equal 2.

(d) One can find solutions in (b) if one allows common factors, for example x^3 + y^3 = z^4 where
x = t(t^3 + 1) and y = z = t^3 + 1. Generalize this construction to as many other sets of
exponents p, q, r as you can. (Try to go beyond the construction in exercise 6.5.8.)


Exercise 6.7.2. Let a and b be given non-zero integers, p, q > 1, and x(t), y(t) ∈ C[t]. Let D be
the maximum of the degrees of x^p and y^q, and assume that ax^p + by^q ≠ 0.
(a) Prove that the degree of ax^p + by^q is > D(1 - 1/p - 1/q).
(b)‡ Prove that if g = (p, q) > 1, then the degree of ax^p + by^q is ≥ D/g.
(c) Deduce that the degree of ax^p + by^q is always > D/6.
(This is “best possible” in the case (t^2 + 2)^3 - (t^3 + 3t)^2 = 3t^2 + 8.)


Appendices. The extended version of chapter 6 has the following additional 
appendices: 

Appendix 6B. No Pythagorean triangle of square area via Euclidean geometry 
presents another proof (due to a student, Stephanie Chan, in 2017) of this theorem 
of Fermat, now via clever geometric manipulations. 

Appendix 6C. Can a binomial coefficient be a square? addresses and resolves 
the question of whether a binomial coefficient can be a square. 


Chapter 7 


Power residues 


We begin by calculating the least residues of the small powers of each given residue 
mod m, to look for interesting patterns: 


a^0 | a | a^2
 1  | 0 |  0
 1  | 1 |  1
Least power residues (mod 2).

a^0 | a | a^2 | a^3 | a^4 | a^5
 1  | 0 |  0  |  0  |  0  |  0
 1  | 1 |  1  |  1  |  1  |  1
 1  | 2 |  1  |  2  |  1  |  2
Least power residues (mod 3).[1]


In these small examples, the columns soon settle into repeating patterns as we go 


from left to right: For example, in the mod 3 case, the columns alternate between 
0,1,1 and 0,1,2. How about for slightly larger moduli? 
a^0 | a | a^2 | a^3 | a^4 | a^5
 1  | 0 |  0  |  0  |  0  |  0
 1  | 1 |  1  |  1  |  1  |  1
 1  | 2 |  0  |  0  |  0  |  0
 1  | 3 |  1  |  3  |  1  |  3
Least power residues (mod 4).

a^0 | a | a^2 | a^3 | a^4 | a^5
 1  | 0 |  0  |  0  |  0  |  0
 1  | 1 |  1  |  1  |  1  |  1
 1  | 2 |  4  |  3  |  1  |  2
 1  | 3 |  4  |  2  |  1  |  3
 1  | 4 |  1  |  4  |  1  |  4
Least power residues (mod 5).
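
Tables of this kind are easy to generate for any modulus. Here is a short sketch (the function name and the default number of columns are our own choices); it relies on Python's built-in three-argument `pow`, which also takes 0^0 to be 1.

```python
def power_residue_table(m, max_exponent=5):
    """Row a lists the least residues of a^0, a^1, ..., a^max_exponent (mod m)."""
    return [[pow(a, e, m) for e in range(max_exponent + 1)] for a in range(m)]


for m in (3, 4, 5):
    print("Least power residues (mod %d)" % m)
    for row in power_residue_table(m):
        print(" ".join(str(entry) for entry in row))
```

Trying a few larger moduli, such as m = 7 or m = 12, is a quick way to guess when the rows return to 1 and when they do not.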


[1] Why did we take 0^0 to be 1 (mod m) for m = 2, 3, 4, and 5? In mathematics we create symbols
and protocols (like taking powers) to represent numbers and actions on those numbers, and then we
need to be able to interpret all combinations of those symbols and protocols. Occasionally some of those
combinations do not have an immediate interpretation, for example 0^0. So how do we deal with this?
Usually mathematicians develop a convenient interpretation that allows that not-well-defined use of a
protocol to nonetheless be consistent with the many appropriate uses of the protocol. Therefore, for
example, we let 0^0 be 1, because it is true that a^0 = 1 for every non-zero number a, so it makes sense
(and is often convenient) to define this to also be so for a = 0.

Perhaps the best known dilemma of this sort comes in asking whether ∞ is a number. The correct
answer is “No, it is a symbol” (representing an upper bound on the set of real numbers) but it is certainly
convenient to treat it as a number in many situations.




Again the patterns repeat, every second power mod 4, and every fourth power mod 
5. Our goal in this chapter is to understand the power residues, and in particular 
when we get these repeated patterns. 


7.1. Generating the multiplicative group of residues 


We begin by verifying that for each coprime pair of integers a and m, the power 
residues do repeat periodically: 


Lemma 7.1.1. For any integer a, with (a, m) = 1, there exists an integer k,
1 ≤ k ≤ φ(m), for which a^k ≡ 1 (mod m).


Proof. Each term of the sequence 1, a, a^2, a^3, ... is coprime with m by exercise
3.3.5. But then each is congruent to some element from any given reduced set of
residues mod m (which has size φ(m)). Therefore, by the pigeonhole principle,
there exist i and j with 0 ≤ i < j ≤ φ(m) for which a^i ≡ a^j (mod m).

Next we divide both sides of this equation by a^i. To justify doing this we
observe that (a^i, m) = 1 (as (a, m) = 1) and so we can use Corollary 3.5.1 to obtain
our result with k = j - i, so that 1 ≤ k ≤ φ(m).


Exercise 7.1.1. (a) Show that for any integers a and m ≥ 2, there exist integers i and k, with
0 ≤ i ≤ m - 1 and 1 ≤ k ≤ m - i such that a^{n+k} ≡ a^n (mod m) for every n ≥ i.
(b) For each integer m ≥ 2 determine an integer a such that a ≢ 1 (mod m) but a^2 ≡ a
(mod m). (This explains why we need the hypothesis that (a, m) = 1 in Lemma 7.1.1.)


Another proof of Corollary [If (a,m) = 1, then a has an inverse mod 
m.] Let r = a^{k-1} so that ar = a^k ≡ 1 (mod m).


Examples. In the geometric progression 2, 4, 8, ..., the first term ≡ 1 (mod 13) is
2^12 = 4096. The first term ≡ 1 (mod 23) is 2^11 = 2048. Similarly 5^6 = 15625 ≡ 1
(mod 7) but 5^5 ≡ 1 (mod 11). We see that in some cases the power needed is as
big as φ(p) = p - 1, the bound given by Lemma 7.1.1, but not always.

If a^k ≡ 1 (mod m), then a^{k+j} ≡ a^j (mod m) for all j ≥ 0, and so the geometric
progression a^0, a^1, a^2, ... modulo m has period k. Thus if u ≡ v (mod k), then a^u ≡
a^v (mod m). Therefore one can easily determine the residues of powers (mod m).
For example, to compute 3^1000 (mod 13), first note that 3^3 ≡ 1 (mod 13). Now
1000 ≡ 1 (mod 3), and so 3^1000 ≡ 3^1 = 3 (mod 13).
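
The same reduction can be written out as a short routine. In this sketch the function name is our own, and the period k is supplied by the caller and checked rather than computed.

```python
def power_mod_via_period(a, exponent, m, k):
    """Compute a^exponent (mod m), given some k >= 1 with a^k ≡ 1 (mod m)."""
    if pow(a, k, m) != 1 % m:
        raise ValueError("k is not a period for a modulo m")
    # Exponents that agree mod k give the same power mod m.
    return pow(a, exponent % k, m)


# 3^3 ≡ 1 (mod 13) and 1000 ≡ 1 (mod 3), so 3^1000 ≡ 3 (mod 13).
print(power_mod_via_period(3, 1000, 13, 3))
print(pow(3, 1000, 13))  # direct computation, for comparison
```

Python's built-in `pow(a, e, m)` is already fast, so the point here is only to mirror the argument in the text.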


If (a, m) = 1, then let ord_m(a), the order of a (mod m), denote the smallest
positive integer k for which a^k ≡ 1 (mod m). We know that there must be such an
integer, by Lemma 7.1.1. We have ord_3(2) = ord_4(3) = 2, ord_5(2) = ord_5(3) = 4
(from the tables above), and ord_13(2) = 12, ord_23(2) = 11, ord_7(5) = 6, and
ord_11(5) = 5 from the examples above. The powers of 3 (mod 16) are 1, 3, 9, 3^3 ≡
11, 3^4 ≡ 1, 3, 9, 11, 1, 3, 9, 11, 1, ... so that the residues are periodic with period
ord_16(3) = 4.
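
The definition translates directly into a loop that multiplies by a until the residue 1 appears. The sketch below is the naive method, adequate for small moduli; the function name is ours.

```python
from math import gcd


def multiplicative_order(a, m):
    """Smallest k >= 1 with a^k ≡ 1 (mod m); requires m >= 2 and (a, m) = 1."""
    if m < 2 or gcd(a, m) != 1:
        raise ValueError("need m >= 2 and a coprime to m")
    k, power = 1, a % m
    while power != 1:
        power = power * a % m
        k += 1
    return k


for a, m in [(2, 3), (3, 4), (2, 5), (3, 5), (2, 13), (2, 23), (5, 7), (5, 11), (3, 16)]:
    print("ord_%d(%d) = %d" % (m, a, multiplicative_order(a, m)))
```

The loop terminates because of Lemma 7.1.1, which also bounds the number of steps by φ(m).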


Lemma 7.1.2. Suppose that a and m are coprime integers with m > 1. Then n is
an integer for which a^n ≡ 1 (mod m) if and only if ord_m(a) divides n.

Proof. Let k := ord_m(a) so that a^k ≡ 1 (mod m). Suppose that n is an integer
for which a^n ≡ 1 (mod m). There exist integers q and r such that n = qk + r where
0 ≤ r ≤ k - 1. Hence a^r = a^n/(a^k)^q ≡ 1/1^q ≡ 1 (mod m). Therefore r = 0 by the
minimality of k (from the definition of order), and so k divides n as claimed.


In the other direction, if k divides n, then a^n = (a^k)^{n/k} ≡ 1 (mod m).


Exercise 7.1.2. Let k := ord_m(a) where (a, m) = 1.
(a) Show that 1, a, a^2, ..., a^{k-1} are distinct (mod m).
(b) Deduce that a^j ≡ a^i (mod m) if and only if j ≡ i (mod k).


We see that ord_m(a) is the smallest period of the sequence 1, a, a^2, ... (mod m).


We wish to understand the possible values of ord_m(a), especially for fixed m,
as a varies over integers coprime to m. We begin by taking m = p prime. The 
theory for composite m can be deduced from an understanding of the prime power 
modulus case, using the Chinese Remainder Theorem as determined in detail in 
section 7.18 of appendix 7B. 


Theorem 7.1. If p is a prime and p does not divide a, then ord_p(a) divides p - 1.


Proof. Let k := ord_p(a) and A = {1, a, a^2, ..., a^{k-1} (mod p)}. For any non-zero b
(mod p) define the set bA = {bα (mod p) : α ∈ A}.

Let b and b' be any two reduced residues mod p. We now show that either bA
and b'A are disjoint or they are equal: If they have an element, c, in common, then
there exist 0 ≤ i, j ≤ k - 1 such that b a^i ≡ c ≡ b' a^j (mod p). Therefore b' ≡ b a^h
(mod p) where h is the least non-negative residue of i - j (mod k). Hence


b' a^ℓ ≡ b a^{h+ℓ} (mod p) if 0 ≤ ℓ ≤ k - 1 - h,
b' a^ℓ ≡ b a^{h+ℓ-k} (mod p) if k - h ≤ ℓ ≤ k - 1,


which implies that b'A ⊂ bA. Since the two sets are finite and of the same size they
must be identical.


Since any two sets of the form bA are either identical or disjoint, we deduce 
that they partition the non-zero elements mod p. That is, the reduced residues 
1,...,p—1 (mod p) may be partitioned into disjoint cosets bA, of A, each of which 
has size |A|; and therefore |A| = k divides p — 1. 


To highlight this proof let a = 5 and p = 13 so that A = {1, 5, 5^2 ≡ 12, 5^3 ≡
8 (mod 13)}. Then the cosets A, 2A = {2, 10, 11, 3 (mod 13)}, and 4A =
{4, 7, 9, 6 (mod 13)} partition the reduced residues mod 13, and therefore 3|A| =
12. Also note that 7A = {7, 9, 6, 4 (mod 13)} = 4A, as claimed, the same residues
but in a rotated order.
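
The partition into cosets used in the proof can be produced mechanically. The following sketch (names are ours) builds A from the powers of a and then collects the cosets bA in increasing order of their smallest unused representative b.

```python
def cosets_of_powers(a, p):
    """Partition the reduced residues mod the prime p into cosets bA of A = {1, a, a^2, ...}."""
    A, power = [], 1
    while power not in A:          # stop when the powers of a start to repeat
        A.append(power)
        power = power * a % p
    cosets, seen = [], set()
    for b in range(1, p):
        if b not in seen:          # b lies in no earlier coset, so bA is new
            coset = [b * x % p for x in A]
            cosets.append(coset)
            seen.update(coset)
    return cosets


print(cosets_of_powers(5, 13))
```

For a = 5 and p = 13 this returns the three cosets A, 2A, and 4A listed above, each of size 4, so that 3 · 4 = 12 = p - 1.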


7.2. Fermat’s Little Theorem 


Theorem 7.1 limits the possible values of ord_p(a). The beauty of the proof of
Theorem 7.1, which is taken from Gauss’s Disquisitiones Arithmeticae, is that it
works in any finite group, as we will see in Proposition 7.22.1 of appendix 7D.[2] This
result leads us directly to one of the great results of elementary number theory, first
observed by Fermat in a letter to Frénicle on October 18, 1640:

[2] What is especially remarkable is that Gauss produced this surprising proof before anyone had
thought up the abstract notion of a group!


Theorem 7.2 (Fermat’s “Little” Theorem). If p is a prime and a is an integer 
that is not divisible by p, then 


p divides a^{p-1} - 1.


Proof. We know that ord_p(a) divides p - 1 by Theorem 7.1, and therefore a^{p-1} ≡ 1
(mod p) by Lemma 7.1.2.


Here is a useful reformulation of Fermat’s “Little” Theorem: 
Fermat’s Little Theorem, v2. If p is a prime and a is a positive integer, then
p divides a^p - a.
Exercise 7.2.1. Prove that our two versions of Fermat’s Little Theorem are equivalent to each 


other (that is, easily imply one another). 


We now present several different proofs of Fermat’s “Little” Theorem and then 
a surprising proof in appendix 7A. 


“Sets of reduced residues” proof. In exercise B.5.2] we proved that {a · 1,
a · 2, ..., a · (p - 1)} form a reduced set of residues mod p. The residues of these
integers mod p are therefore the same as the residues of {1, 2, ..., p - 1} although
in a different order. Since the two sets are the same mod p, the products of the
elements of each set are equal mod p, and so


(a · 1)(a · 2) ··· (a · (p - 1)) ≡ 1 · 2 ··· (p - 1) (mod p);
that is,
a^{p-1} · (p - 1)! ≡ (p - 1)! (mod p).


As (p, (p - 1)!) = 1, we can divide the (p - 1)! out from both sides to obtain the
desired


a^{p-1} ≡ 1 (mod p).


Euler’s 1741 proof. We shall show that a^p - a is divisible by p for every integer
a ≥ 1. We proceed by induction on a: For a = 1 we have 1^p - 1 = 0, and so the
result is trivial. Otherwise, by the binomial theorem,


(a + 1)^p - a^p - 1 = Σ_{i=1}^{p-1} C(p, i) a^i ≡ 0 (mod p),


as p divides the numerator but not the denominator of the binomial coefficient
C(p, i) for each i, 1 ≤ i ≤ p - 1 (as in exercise 2.5.8). Reorganizing we obtain


(a + 1)^p - (a + 1) ≡ (a^p + 1) - (a + 1) = a^p - a ≡ 0 (mod p),


the last congruence following from the induction hypothesis.

Combinatorial proof. The numerator, but not the denominator, of the multinomial
coefficient p!/(i! j! k! ···), where i + j + k + ··· = p, is divisible by p unless one
of i, j, k, ... equals p and the others equal 0. In this case the multinomial
coefficient equals 1. Therefore, by the multinomial theorem,[3]


(a + b + c + ···)^p ≡ a^p + b^p + c^p + ··· (mod p).


Taking a = b = c = ··· = 1 gives ℓ^p ≡ ℓ (mod p) for all integers ℓ ≥ 1.


Another proof of Theorem 7.1. Theorem 7.1 follows from Fermat’s Little Theorem
and Lemma 7.1.2 with m = p and n = p - 1. (This is not a circular argument
as our last three proofs of Fermat’s Little Theorem do not use Theorem 7.1.)


We can use Fermat’s Little Theorem to help quickly determine large powers
in modular arithmetic. For example for 2^1000001 (mod 31), we have 2^30 ≡ 1
(mod 31) by Fermat’s Little Theorem, and so, as 1000001 ≡ 11 (mod 30), we
obtain 2^1000001 ≡ 2^11 (mod 31) and it remains to do the final calculation. However,
using the order makes this calculation significantly easier: Since ord_31(2) = 5
we have 2^5 ≡ 1 (mod 31) and therefore, as 1000001 ≡ 1 (mod 5), we obtain
2^1000001 ≡ 2^1 = 2 (mod 31).


It is worth stating the contrapositive of Fermat’s Little Theorem:


Corollary 7.2.1. If (a, n) = 1 and a^{n-1} ≢ 1 (mod n), then n is composite.


For example (2, 15) = 1 and 2^4 = 16 ≡ 1 (mod 15) so that 2^14 ≡ 2^2 = 4
(mod 15). Hence 15 is a composite number. The surprise here is that we have
proved that 15 is composite without having to factor 15. Indeed whenever Corollary
7.2.1 is applicable we will not have to factor n to show that it is composite. This is
important because we do not know a fast way to factor an arbitrarily large integer
n, but one can compute rapidly with Corollary 7.2.1 (as discussed in section 7.13
of appendix 7A). We will discuss such compositeness tests in section 
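
As a minimal sketch of how Corollary 7.2.1 is used in practice (the function name is ours), the test below reports whether a given base a proves that n is composite. It uses Python's built-in modular exponentiation, which works by repeated squaring.

```python
from math import gcd


def fermat_proves_composite(a, n):
    """True when Corollary 7.2.1 applies: (a, n) = 1 and a^(n-1) is not ≡ 1 (mod n)."""
    return gcd(a, n) == 1 and pow(a, n - 1, n) != 1


print(fermat_proves_composite(2, 15))    # 2^14 ≡ 4 (mod 15), so 15 is composite
print(fermat_proves_composite(2, 13))    # 13 is prime, so no base can succeed
print(fermat_proves_composite(2, 341))   # inconclusive, although 341 = 11 * 31
```

The last line illustrates that the test only works in one direction: a composite n may still satisfy a^{n-1} ≡ 1 (mod n) for a particular base a, here because 2^10 = 1024 ≡ 1 (mod 341).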


Exercise 7.2.2. Prove that for any m > 1 if (a, m) = 1, then ord_m(a) divides φ(m) (by an
analogous proof to that of Theorem 7.1).


Theorem 7.3 (Euler’s Theorem). For any m > 1 if (a, m) = 1, then a^{φ(m)} ≡ 1
(mod m).


Proof. By definition a^{ord_m(a)} ≡ 1 (mod m). By exercise 7.2.2 there exists an
integer k for which φ(m) = k ord_m(a) and so a^{φ(m)} = (a^{ord_m(a)})^k ≡ 1 (mod m).
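
Euler’s Theorem is easy to test numerically for small moduli. In the sketch below φ(m) is computed straight from its definition as a count of reduced residues, which is slow but transparent; both function names are ours.

```python
from math import gcd


def phi(m):
    """Euler's function: the number of 1 <= a <= m with (a, m) = 1."""
    return sum(1 for a in range(1, m + 1) if gcd(a, m) == 1)


def euler_holds(m):
    """Check a^phi(m) ≡ 1 (mod m) for every reduced residue a mod m."""
    exponent = phi(m)
    return all(pow(a, exponent, m) == 1 % m
               for a in range(1, m + 1) if gcd(a, m) == 1)


print(phi(12), phi(13), phi(16))
print(all(euler_holds(m) for m in range(2, 200)))
```

For a prime modulus p we have φ(p) = p - 1, so this check includes Fermat’s Little Theorem as a special case.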


This result and proof generalizes even further, to any finite group, as we will 
see in Corollary 7.23.1 of appendix 7D.


Exercise 7.2.3. Prove Euler’s Theorem using the idea in the “sets of reduced residues” proof of 
Fermat’s Little Theorem, given above. 


Exercise 7.2.4. Determine the last decimal digit of 3°49. 


[3] For the reader who has seen it before.


7.3. Special primes and orders

We now look at prime divisors of the Mersenne and Fermat numbers using our 
results on orders. 

Exercise 7.3.1. Show that if p is prime and q is a prime dividing 2^p - 1, then ord_q(2) = p.


Hence, by exercise 7.3.1, if q divides 2^p - 1, then p = ord_q(2) divides q - 1 by
Theorem 7.1; that is, q ≡ 1 (mod p).
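
One can watch this congruence appear in examples by factoring small Mersenne numbers. The sketch below uses plain trial division, which is only practical for small p; the function names are ours.

```python
def prime_factors(n):
    """Return the distinct prime factors of n >= 2 by trial division."""
    factors, d = [], 2
    while d * d <= n:
        if n % d == 0:
            factors.append(d)
            while n % d == 0:
                n //= d
        d += 1
    if n > 1:
        factors.append(n)
    return factors


def divisors_are_one_mod_p(p):
    """Check that every prime q dividing 2^p - 1 satisfies q ≡ 1 (mod p)."""
    return all(q % p == 1 for q in prime_factors(2**p - 1))


print(prime_factors(2**11 - 1))        # 2047 = 23 * 89, and 23 ≡ 89 ≡ 1 (mod 11)
print(divisors_are_one_mod_p(11))
print(divisors_are_one_mod_p(4))       # the exponent must be prime: 2^4 - 1 = 3 * 5
```

The failure for the composite exponent 4 shows that the hypothesis that p is prime is really used.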


Another proof that there are infinitely many primes. If $p$ is the largest
prime, let $q$ be a prime factor of $2^p - 1$. We have just seen that $p$ divides $q - 1$, so
that $p \le q - 1 < q$. This contradicts the assumption that $p$ is the largest prime.


Exercise 7.3.2.† Show that if prime $p$ divides $F_n = 2^{2^n} + 1$, then $\mathrm{ord}_p(2) = 2^{n+1}$. Deduce that
$p \equiv 1 \pmod{2^{n+1}}$.


Theorem 7.4. Fix $k \ge 2$. There are infinitely many primes $\equiv 1 \pmod{2^k}$.


Proof. If $p_n$ is a prime factor of $F_n = 2^{2^n} + 1$, then $p_n \equiv 1 \pmod{2^k}$ for all
$n \ge k - 1$, by exercise 7.3.2. We saw that the $p_n$ are all distinct in section B.1]


7.4. Further observations 


An earlier lemma, a weak form of the Fundamental Theorem of Algebra (Theorem 3.11),
states that any polynomial in $\mathbb{C}[x]$ of degree $d$ has at most $d$ roots. An analogous
result can be proved for polynomials mod $p$.


Proposition 7.4.1 (Lagrange). Suppose that $p$ is a prime and that $f(x)$ is a non-zero
polynomial with coefficients in $\mathbb{Z}/p\mathbb{Z}$ of degree $d$. Then $f(x)$ has no more than
$d$ roots mod $p$ (counted with multiplicity).


Proof. By induction on $d \ge 0$. This is trivial for $d = 0$. For $d \ge 1$ we may suppose
that $f(a) \equiv 0 \pmod p$ for some integer $a$ (if there is no such $a$, then $f$ has no roots
mod $p$ and we are done). Write $f(x) = \sum_{i=0}^{d} f_i x^i$ and define

$$g(x) = \frac{f(x) - f(a)}{x - a} = \sum_{i=0}^{d} f_i \, \frac{x^i - a^i}{x - a} = \sum_{i=1}^{d} f_i \, (x^{i-1} + a x^{i-2} + \cdots + a^{i-1}),$$

a polynomial of degree $d - 1$ with leading coefficient $f_d$ (so is non-zero). Therefore
$g(x)$ has no more than $d - 1$ roots mod $p$, by the induction hypothesis. Now

$$f(x) \equiv f(x) - f(a) = (x - a) g(x) \pmod p$$

and so if $f(b) \equiv 0 \pmod p$, then $(b - a) g(b) \equiv 0 \pmod p$. Either $b \equiv a \pmod p$
or $g(b) \equiv 0 \pmod p$, and so $f$ has no more than $1 + (d - 1) = d$ roots mod $p$.


Fermat’s Little Theorem implies that $1, 2, 3, \ldots, p-1$ are $p-1$ distinct roots of
$x^{p-1} - 1 \pmod p$, and are therefore all the roots, by Proposition 7.4.1. Therefore
the polynomials $x^{p-1} - 1$ and $(x-1)(x-2)\cdots(x-(p-1))$ mod $p$ are the same
up to a multiplicative constant. Since they are both monic they must be identical;
that is,

(7.4.1)    $x^{p-1} - 1 \equiv (x-1)(x-2)\cdots(x-(p-1)) \pmod p$,

which implies that

    $x^p - x \equiv x(x-1)(x-2)\cdots(x-(p-1)) \pmod p$.
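The identity (7.4.1) can be verified for small primes by multiplying out the linear factors with coefficients reduced mod $p$. This is only an illustrative sketch; the function names and the coefficient-list representation (lowest degree first) are assumptions made here, not notation from the text.

```python
def product_of_linear_factors(p):
    """Coefficients (lowest degree first) of (x-1)(x-2)...(x-(p-1)), reduced mod p."""
    coeffs = [1]                                   # the constant polynomial 1
    for r in range(1, p):
        new = [0] * (len(coeffs) + 1)
        for i, c in enumerate(coeffs):
            new[i + 1] = (new[i + 1] + c) % p      # term coming from x * (c x^i)
            new[i] = (new[i] - r * c) % p          # term coming from (-r) * (c x^i)
        coeffs = new
    return coeffs

def x_power_minus_one(p):
    """Coefficients (lowest degree first) of x^(p-1) - 1, reduced mod p."""
    return [(-1) % p] + [0] * (p - 2) + [1]

for p in (2, 3, 5, 7, 11, 13):
    print(p, product_of_linear_factors(p) == x_power_minus_one(p))
```

Comparing constant terms of the two coefficient lists is exactly the step used to deduce Wilson's Theorem.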


Theorem 7.5 (Wilson’s Theorem). For any prime $p$ we have

    $(p-1)! \equiv -1 \pmod p$.


Proof. Take $x = 0$ in (7.4.1), and note that $(-1)^{p-1} \equiv 1 \pmod p$.


Gauss’s proof of Wilson’s Theorem. Let $S$ be the set of pairs $(a,b)$ for which
$1 < a < b < p$ and $ab \equiv 1 \pmod p$; that is, every residue is paired up with its
inverse unless it equals its inverse. Now if $a \equiv a^{-1} \pmod p$, then $a^2 \equiv 1 \pmod p$,
in which case $a \equiv 1$ or $p-1 \pmod p$ by Lemma 3.8.1. Therefore

$$1 \cdot 2 \cdots (p-1) = 1 \cdot (p-1) \cdot \prod_{(a,b) \in S} ab \equiv 1 \cdot (-1) \cdot \prod_{(a,b) \in S} 1 = -1 \pmod p.$$


Example. For $p = 13$ we have
$12! = 12(2 \times 7)(3 \times 9)(4 \times 10)(5 \times 8)(6 \times 11) \equiv -1 \cdot 1 \cdot 1 \cdot 1 \cdot 1 \cdot 1 = -1 \pmod{13}$.


Exercise 7.4.1. (a) Show that if $n > 4$ is composite, then $n$ divides $(n-1)!$.
(b) Show that $n > 2$ is prime if and only if $n$ divides $(n-1)! + 1$.


Combining Wilson’s Theorem with the last exercise we have an indirect primality
test for integers $n > 2$: Compute $(n-1)! \pmod n$. If it is $\equiv -1 \pmod n$,
then $n$ is prime; if it is $\equiv 0 \pmod n$, then $n$ is composite. Note however that in
determining $(n-1)!$ we need to do $n-2$ multiplications, so that this primality test
takes far more steps than trial division (see section 5.2)!
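As a small illustration of this test, here is a sketch (the function name `wilson_is_prime` is chosen here for illustration) that computes $(n-1)!$ mod $n$ with exactly the $n-2$ multiplications mentioned above. It is meant to show the idea, not to be a practical primality test.

```python
def wilson_is_prime(n):
    """Wilson test for n >= 2: n is prime exactly when (n-1)! ≡ -1 (mod n)."""
    factorial = 1
    for j in range(2, n):                 # n - 2 multiplications, reducing mod n each time
        factorial = (factorial * j) % n
    return factorial == n - 1

print([n for n in range(2, 30) if wilson_is_prime(n)])
```

The loop length grows linearly in $n$, which is why the test is slower than trial division up to $\sqrt{n}$.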


Exercise 7.4.2. (a) Use the idea in Gauss’s proof of Wilson’s Theorem to show that

$$\prod_{\substack{1 \le a \le n \\ (a,n) = 1}} a \equiv \prod_{\substack{1 \le b \le n \\ b^2 \equiv 1 \pmod n}} b \pmod n.$$

(b) Evaluate this product using exercise 3.8.3 or by pairing $b$ with $n - b$.


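Exercise 7.4.3. (a) Show that $\left( \left( \frac{p-1}{2} \right)! \right)^2 \equiv (-1)^{(p+1)/2} \pmod p$.
(b) Deduce that if $p \equiv 3 \pmod 4$, then $\left( \frac{p-1}{2} \right)! \equiv 1$ or $-1 \pmod p$.
(c) Deduce that if $p \equiv 1 \pmod 4$, then $\left( \frac{p-1}{2} \right)!$ is a root of $x^2 \equiv -1 \pmod p$.⁴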


7.5. The number of elements of a given order, and primitive roots 


In Theorem 7.1 we saw that the order modulo $p$ of any integer $a$ which is coprime
to $p$ must be an integer which divides $p-1$. In this section we show that for each
divisor $m$ of $p-1$, there are residue classes mod $p$ of order $m$.


4 This explicitly provides a square root of $-1 \pmod p$ which is interesting, as there is no easy way
in general to determine square roots mod $p$. However we do not know how to rapidly calculate the least
residue of $\left( \frac{p-1}{2} \right)! \pmod p$.
Example. For the primes p = 13 and p = 19 we have


Order | a (mod 13)       Order | a (mod 19)
  1   | 1                  1   | 1
  2   | 12                 2   | 18
  3   | 3, 9               3   | 7, 11
  4   | 5, 8               6   | 8, 12
  6   | 4, 10              9   | 4, 5, 6, 9, 16, 17
 12   | 2, 6, 7, 11       18   | 2, 3, 10, 13, 14, 15


How many residues are there of each order? From these examples we might guess 
the following result. 


Theorem 7.6. If $m$ divides $p-1$, then there are exactly $\phi(m)$ elements $a \pmod p$
of order $m$.
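The counts in Theorem 7.6 can be reproduced by brute force for small primes. The sketch below assumes illustrative helper names (`multiplicative_order`, `phi`, `order_counts`) that are not from the text; it tabulates how many residues have each order and prints the count next to $\phi(m)$.

```python
from collections import Counter
from math import gcd

def multiplicative_order(a, p):
    """Smallest k >= 1 with a**k ≡ 1 (mod p); assumes gcd(a, p) == 1."""
    k, x = 1, a % p
    while x != 1:
        x = (x * a) % p
        k += 1
    return k

def phi(m):
    """Euler's totient, computed directly from its definition."""
    return sum(1 for j in range(1, m + 1) if gcd(j, m) == 1)

def order_counts(p):
    """Map each order m to the number of reduced residues mod p with that order."""
    return Counter(multiplicative_order(a, p) for a in range(1, p))

for p in (13, 19):
    counts = order_counts(p)
    for m in sorted(counts):
        print(p, m, counts[m], phi(m))    # the last two columns should agree
```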


A primitive root $a$ mod $p$ is a reduced residue mod $p$ of order $p-1$. The least
residues of the powers

    $1, a, a^2, a^3, \ldots, a^{p-2} \pmod p$

are distinct reduced residues, since $a$ has order $p-1$, and so must equal

    $1, 2, \ldots, p-1$

in some order. Therefore every reduced residue is congruent to some power $a^j$
$\pmod p$ of $a$, and the power $j$ can be reduced mod $p-1$. For example, 2, 3, 10,
13, 14, and 15 are the primitive roots mod 19. We can verify that the powers of 3
mod 19 yield a reduced set of residues:

    $1, 3, 3^2, 3^3, \ldots, 3^{17}, 3^{18}, \ldots$
    $\equiv 1, 3, 9, 8, 5, -4, 7, 2, 6, -1, -3, -9, -8, -5, 4, -7, -2, -6, 1, \ldots \pmod{19}$,

respectively, so 3 is a primitive root mod 19. Taking $m = p-1$ in Theorem 7.6 we
obtain the following:


Corollary 7.5.1. For every prime $p$ there exists a primitive root mod $p$. In fact
there are $\phi(p-1)$ distinct primitive roots mod $p$.


To prove Theorem 7.6 it helps to first establish the following lemma:


Lemma 7.5.1. If $m$ divides $p-1$, then there are exactly $m$ elements $a \pmod p$
for which $a^m \equiv 1 \pmod p$.


Proof. We saw in (7.4.1) that

$$x^{p-1} - 1 = (x^m - 1)(x^{p-1-m} + x^{p-1-2m} + \cdots + x^{2m} + x^m + 1)$$

factors into distinct linear factors mod $p$, and therefore $x^m - 1$ does so also.


The residue $a \pmod p$ is counted in Lemma 7.5.1 if and only if the order of $a$
divides $m$. Now we prove Theorem 7.6, which counts the number of residue classes
$a \pmod p$ whose order is exactly $m$.
Proof of Theorem 7.6. Let $\psi(d)$ denote the number of elements $a \pmod p$ of
order $d$. The set of roots of $x^m - 1 \pmod p$ is precisely the union of the sets of
residue classes mod $p$ of order $d$, over each $d$ dividing $m$, so that

(7.5.1)    $\sum_{d \mid m} \psi(d) = m$

for all positive integers $m$ dividing $p-1$, by Lemma 7.5.1. We now prove that
$\psi(m) = \phi(m)$ for all $m$ dividing $p-1$, by induction on $m$. The only element of
order 1 is $1 \pmod p$, so that $\psi(1) = 1 = \phi(1)$. For $m > 1$ we have $\psi(d) = \phi(d)$ for
all $d < m$ that divide $m$, by the induction hypothesis. Therefore

$$\psi(m) = m - \sum_{\substack{d \mid m \\ d < m}} \psi(d) = m - \sum_{\substack{d \mid m \\ d < m}} \phi(d) = \phi(m),$$

the last equality following from Proposition 4.1.1. The result follows.


Although there are many primitive roots mod $p$ ($\phi(p-1)$ of them by Theorem
7.6), it is not obvious how to always find one rapidly. In section 7.15 of appendix
7B we will present Gauss’s practical algorithm for finding primitive roots (as well
as special cases in exercises 8.9.20, 8.9.21, and 8.9.22).


It is believed that 2 is a primitive root mod $p$ for infinitely many primes $p$,
though this remains an open question. Artin’s primitive root conjecture states that
every prime $q$ is a primitive root mod $p$ for infinitely many primes $p$. This is known
to be true for all, but at most two, primes.⁵ Gauss himself conjectured that 10 is a
primitive root mod $p$ for infinitely many primes $p$ and this is also an open question.
Any integer $m$, which is neither a perfect square nor $-1$, is conjectured to be a
primitive root mod $p$ for infinitely many primes $p$.


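Corollary 7.5.2. For every prime $p$ and every integer $k$, we have

$$1^k + 2^k + \cdots + (p-1)^k \equiv \begin{cases} -1 & \text{if } p-1 \text{ divides } k, \\ 0 & \text{otherwise} \end{cases} \pmod p.$$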


Proof. Let $S_k := 1^k + 2^k + \cdots + (p-1)^k$. If $p-1$ divides $k$, then each $j^k \equiv 1$
$\pmod p$ by Fermat’s Little Theorem and so $S_k \equiv 1 + \cdots + 1 = p-1 \equiv -1 \pmod p$, as
claimed. So, henceforth assume that $p-1$ does not divide $k$.


Let $a$ be a primitive root mod $p$, so that $a^k \not\equiv 1 \pmod p$ since $p-1$ does not
divide $k$. The integers $\{a \cdot 1, a \cdot 2, \ldots, a \cdot (p-1)\}$ form a reduced set of residues mod
$p$ and so are a rearrangement of the residues of $\{1, 2, \ldots, p-1\}$ mod $p$. Therefore
any symmetric function of these two sets of residues takes congruent values mod $p$
(as we saw in the “Sets of reduced residues” proof of Fermat’s Little Theorem); in
particular,

$$S_k \equiv \sum_{j=1}^{p-1} (aj)^k = a^k S_k \pmod p.$$

Therefore $(a^k - 1) S_k \equiv 0 \pmod p$ but $a^k \not\equiv 1 \pmod p$ and so $S_k \equiv 0 \pmod p$.
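The two cases of Corollary 7.5.2 are easy to observe numerically. The following sketch (with the illustrative name `power_sum_mod`, and restricted to exponents $k \ge 0$ for simplicity) prints the power sums mod 7 for several exponents.

```python
def power_sum_mod(p, k):
    """1^k + 2^k + ... + (p-1)^k reduced mod p, for an integer k >= 0."""
    return sum(pow(j, k, p) for j in range(1, p)) % p

p = 7
for k in range(0, 13):
    # p - 1 (that is, -1 mod p) is expected exactly when p - 1 divides k
    print(k, power_sum_mod(p, k))
```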


5 This result is strangely formulated because of the nature of what was proved (by Heath-Brown
[2], improving a result of Gupta and Murty, see [8]): that in any set of three distinct primes $q_1, q_2, q_3$,
at least one is a primitive root mod $p$ for infinitely many primes $p$. Therefore there cannot be three
exceptions to the conjecture, and we believe that there are none.
Near the beginning of this section we noted that if $a$ is a primitive root $\pmod p$,
then every reduced residue is congruent to some power of $a$ $\pmod p$. This property
is extremely useful for it allows us to treat multiplication as addition of exponents
in the same way that the introduction of logarithms simplifies usual multiplication.
We will discuss this further in section 7.16 of appendix 7B.


Exercise 7.5.1. Write each reduced residue mod $p$ as a power of the primitive root $a$, and use
this to evaluate $1^k + 2^k + \cdots + (p-1)^k \pmod p$ as a function of $a$ and $k$. Use this to give another
proof of Corollary 7.5.2.


Exercise 7.5.2. Let $g$ be a primitive root modulo odd prime $p$.
(a) Prove that $g^a \equiv 1 \pmod p$ if and only if $p-1$ divides $a$.
(b) Show that $g^{(p-1)/2} \equiv -1 \pmod p$.


In order to determine the order of an element mod n, one can use the following 
result: 


Proposition 7.5.1. Suppose that $a$ and $n$ are coprime integers. Then $d$ is the
order of $a \pmod n$ if and only if $a^d \equiv 1 \pmod n$ and $a^{d/q} \not\equiv 1 \pmod n$ for every
prime $q$ dividing $d$.


Proof. If $d$ is the order of $a \pmod n$, then $a^d \equiv 1 \pmod n$ and $a^{d/q} \not\equiv 1 \pmod n$
by the definition of order, since $d/q < d$.

On the other hand let $m := \mathrm{ord}_n(a)$. By Lemma 7.1.2 we know that $m$ divides
$d$ but does not divide $d/q$ for any prime $q$ dividing $d$. Therefore $q$ does not divide
$d/m$ for any prime $q$ dividing $d$, so there cannot be any primes $q$ that divide $d/m$.
This implies that $d/m = 1$ and so $\mathrm{ord}_n(a) = m = d$.


We deduce an important practical way to recognize primitive roots mod p: 


Corollary 7.5.3. Suppose that $p$ is a prime that does not divide integer $a$. Then
$a$ is a primitive root $\pmod p$ if and only if

    $a^{(p-1)/q} \not\equiv 1 \pmod p$

for all primes $q$ dividing $p-1$.


Proof. By definition $a$ is a primitive root $\pmod p$ if and only if $m := \mathrm{ord}_p(a) = p-1$.
The result follows from Proposition 7.5.1.
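Corollary 7.5.3 translates directly into a test that needs only the prime factors of $p-1$. The sketch below uses trial division for the factorization and illustrative function names (`prime_factors`, `is_primitive_root`); these are assumptions for the example, not the algorithm of appendix 7B.

```python
def prime_factors(n):
    """Set of distinct primes dividing n, found by trial division."""
    factors, d = set(), 2
    while d * d <= n:
        while n % d == 0:
            factors.add(d)
            n //= d
        d += 1
    if n > 1:
        factors.add(n)
    return factors

def is_primitive_root(a, p):
    """Test from Corollary 7.5.3: a^((p-1)/q) is not 1 mod p for every prime q dividing p-1."""
    if a % p == 0:
        return False
    return all(pow(a, (p - 1) // q, p) != 1 for q in prime_factors(p - 1))

print([a for a in range(1, 19) if is_primitive_root(a, 19)])
```

Only one modular exponentiation per prime factor of $p-1$ is needed, rather than computing every power of $a$.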


Exercise 7.5.3. Find all residues of order 5 mod 31, given that $2^5 \equiv 1 \pmod{31}$.


Exercise 7.5.4. (a) Prove that 2 is a primitive root (mod 13). 
(b) Use this to determine all of the other primitive roots (mod 13). 


Exercise 7.5.5. Let $g$ be a primitive root modulo odd prime $p$.
(a) Prove that if $m$ divides $p-1$, then $g^m$ has order $(p-1)/m$.
(b)† Prove that $g^k \pmod p$ is a primitive root mod $p$ if and only if $(k, p-1) = 1$.
(c) Deduce that there are $\phi(p-1)$ primitive roots mod $p$.
7.6. Testing for composites, pseudoprimes, and Carmichael numbers


In the converse to Fermat’s Little Theorem, Corollary 7.2.1, we saw that if an
integer $n$ does not divide $a^{n-1} - 1$ for some integer $a$ coprime to $n$, then $n$ is
composite. For example, taking $a = 2$ we calculate that

    $2^{1000} \equiv 562 \pmod{1001}$,

so we know that 1001 is composite. We might ask whether this always works. In
other words:
    Is it true that if $n$ is composite, then $n$ does not divide $2^n - 2$?


For, if so, we have a very nice way to distinguish primes from composites. Unfortunately
the answer is “no” since, for example,

    $2^{340} \equiv 1 \pmod{341}$,

but $341 = 11 \times 31$. We call 341 a base-2 pseudoprime. Note though that

    $3^{340} \equiv 56 \pmod{341}$,

and so the converse to Fermat’s Little Theorem, with $a = 3$, implies that 341 is
composite.
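The three congruences quoted in this section can be reproduced with Python's built-in modular exponentiation. The helper name `fermat_test` below is an illustrative choice; a `False` result proves compositeness (when the base is coprime to $n$), while `True` proves nothing.

```python
def fermat_test(n, a):
    """True if a^(n-1) ≡ 1 (mod n). A False result shows n is composite when gcd(a, n) = 1."""
    return pow(a, n - 1, n) == 1

print(pow(2, 1000, 1001))     # residue of 2^1000 mod 1001
print(fermat_test(341, 2))    # 341 passes the test to base 2 ...
print(pow(3, 340, 341))       # ... but the base-3 residue is not 1
```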

Are there composites $n$ for which $2^{n-1} \equiv 3^{n-1} \equiv 1 \pmod n$? Or $2^{n-1} \equiv
3^{n-1} \equiv 5^{n-1} \equiv 1 \pmod n$? Or, even Carmichael numbers, composite numbers
that “masquerade” as primes in that $a^{n-1} \equiv 1 \pmod n$ for all integers $a$ coprime
to $n$? A quick computer search finds the smallest example: $561 = 3 \cdot 11 \cdot 17$. The
next few Carmichael numbers are $1105 = 5 \cdot 13 \cdot 17$, then $1729 = 7 \cdot 13 \cdot 19$, etc.


Exercise 7.6.1. Show that squarefree $n$ is a Carmichael number if and only if $n$ is composite
and divides $a^n - a$ for all integers $a$.


Carmichael numbers are a nuisance, masquerading as primes like this (and 
so preventing a quick and easy, surefire primality test). Calculations reveal that 
Carmichael numbers are rare, but in 1994 Alford, Pomerance, and I proved 
that there are infinitely many of them. Here is a more elegant way to recognize 
Carmichael numbers: 


Lemma 7.6.1. A positive integer n is a Carmichael number if and only if n is 
squarefree and composite and p—1 divides n—1 for every prime p dividing n. 


Proof. Suppose that $n$ is squarefree and composite and $p-1$ divides $n-1$ for
every prime $p$ dividing $n$. If $(a,n) = 1$ and prime $p$ divides $n$, then $\mathrm{ord}_p(a)$ divides
$p-1$ by Theorem 7.1, which divides $n-1$, and so $a^{n-1} \equiv 1 \pmod p$ by Lemma
7.1.2. Therefore $a^{n-1} \equiv 1 \pmod n$ by the Chinese Remainder Theorem as $n$ is
squarefree, and so it is a Carmichael number.


Now suppose that $n$ is a Carmichael number. If prime $p$ divides $n$, then $a^{n-1} \equiv 1$
$\pmod p$ for all integers $a$ coprime to $n$. In particular, if $a$ is a primitive root mod
$p$, then $p-1 = \mathrm{ord}_p(a)$ divides $n-1$ by Lemma 7.1.2.

Now assume that $p^e \| n$ with $e \ge 2$. We note that $(1+p)^k \equiv 1 + kp \pmod{p^2}$
for all integers $k \ge 1$, by the binomial theorem, so that $\mathrm{ord}_{p^2}(1+p) = p$. Select
$a \equiv 1+p \pmod{p^e}$ with $a \equiv 1 \pmod{n/p^e}$ so that $(a,n) = 1$. As $p \mid n$ we have
$1 \equiv (1+p)^n \equiv a^n \equiv a \equiv 1+p \pmod{p^2}$, a contradiction. Therefore $n$ must be
squarefree.


Lemma 7.6.1 implies that $561 = 3 \cdot 11 \cdot 17$ is a Carmichael number as 2, 10, and
16 divide 560.
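The criterion of Lemma 7.6.1 gives a simple search for Carmichael numbers once $n$ is factored. The sketch below factors by trial division and uses illustrative names (`factorize`, `is_carmichael`); it is meant only for small $n$.

```python
def factorize(n):
    """Prime factorization of n as a dict {prime: exponent}, by trial division."""
    factors, d = {}, 2
    while d * d <= n:
        while n % d == 0:
            factors[d] = factors.get(d, 0) + 1
            n //= d
        d += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors

def is_carmichael(n):
    """Lemma 7.6.1: n is squarefree, composite, and p - 1 divides n - 1 for each prime p | n."""
    factors = factorize(n)
    squarefree = all(e == 1 for e in factors.values())
    composite = sum(factors.values()) >= 2
    return squarefree and composite and all((n - 1) % (p - 1) == 0 for p in factors)

print([n for n in range(2, 2000) if is_carmichael(n)])
```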


Exercise 7.6.2. Show that if n is a Carmichael number, then it is odd. 
Exercise 7.6.3.† Show that if $n$ is a Carmichael number, then it has at least three prime factors.


Exercise 7.6.4. Prove that if 6m+1, 12m+1, and 18m +1 are all primes, then their product 
is a Carmichael number. (It is an open problem whether there exist infinitely many such prime 
triples, though it is not difficult to find examples, like 7 x 13 x 19 and 37 x 73 x 109.) 


7.7. Divisibility tests, again 


In section 2.4 we found simple tests for the divisibility of integers by 7, 9, 11, and
13, promising to return to this theme later. The key to these earlier tests was
that $10 \equiv 1 \pmod 9$ and $10^3 \equiv -1 \pmod{7 \cdot 11 \cdot 13}$; that is, $\mathrm{ord}_9(10) = 1$ and
$\mathrm{ord}_7(10)$, $\mathrm{ord}_{11}(10)$, and $\mathrm{ord}_{13}(10)$ divide 6. For all primes $p \ne 2$ or 5 we know that
$k := \mathrm{ord}_p(10)$ is an integer dividing $p-1$. Hence if $n = \sum_{j=0}^{d} n_j 10^j$, then

$$n = \sum_{m \ge 0} \Big( \sum_{i=0}^{k-1} n_{km+i} 10^i \Big) (10^k)^m \equiv \sum_{m \ge 0} \Big( \sum_{i=0}^{k-1} n_{km+i} 10^i \Big) \pmod p,$$

since if $j = km + i$, then $10^j \equiv 10^i \pmod p$. In the displayed equation we have
cut up the integer $n$, written in decimal, into blocks of digits of length $k$ and added
these blocks together, which is clearly an efficient way to test for divisibility. The
length of these blocks, $k$, is always $\le p-1$ no matter what the size of $n$. Therefore
we can, in practice, quickly test whether $n$ is divisible by $p$, once we know the
$p$-divisibility of every integer $< 10^k$ ($\le 10^{p-1}$).

If $k = 2\ell$ is even, we can do a little better (as we did with $p = 7$, 11, and 13)
since $10^\ell \equiv -1 \pmod p$, namely that

$$n = \sum_{j=0}^{d} n_j 10^j \equiv \sum_{m \ge 0} \Big( \sum_{i=0}^{\ell-1} n_{km+i} 10^i - \sum_{i=0}^{\ell-1} n_{km+\ell+i} 10^i \Big) \pmod p,$$

thus breaking $n$ up into blocks of length $\ell = k/2$.
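Both block tests can be sketched in a few lines. The function names below (`block_sum`, `alternating_block_sum`) are illustrative assumptions; the first adds blocks of length $k = \mathrm{ord}_p(10)$, and the second alternates signs on blocks of length $k/2$ when $k$ is even and $p$ is prime.

```python
def multiplicative_order(a, n):
    """Smallest k >= 1 with a**k ≡ 1 (mod n); assumes gcd(a, n) == 1."""
    k, x = 1, a % n
    while x != 1:
        x = (x * a) % n
        k += 1
    return k

def block_sum(n, p):
    """Sum of the length-k decimal blocks of n, k = ord_p(10); congruent to n mod p."""
    base = 10 ** multiplicative_order(10, p)
    total = 0
    while n > 0:
        n, block = divmod(n, base)      # peel off the lowest k digits
        total += block
    return total

def alternating_block_sum(n, p):
    """Alternating sum of blocks of length k/2; needs p prime and k = ord_p(10) even."""
    k = multiplicative_order(10, p)
    if k % 2:
        raise ValueError("ord_p(10) must be even")
    base, total, sign = 10 ** (k // 2), 0, 1
    while n > 0:
        n, block = divmod(n, base)
        total += sign * block           # 10^(k/2) ≡ -1 (mod p), so signs alternate
        sign = -sign
    return total

n = 123456789012
print(block_sum(n, 7) % 7, alternating_block_sum(n, 7) % 7, n % 7)
```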


7.8. The decimal expansion of fractions 


The fraction $\frac{1}{3} = .3333\ldots$ is given by a recurring digit 3, so we write it as $.\overline{3}$. More
interesting to us are the set of fractions

$$\tfrac{1}{7} = .\overline{142857}, \quad \tfrac{2}{7} = .\overline{285714}, \quad \tfrac{3}{7} = .\overline{428571},$$
$$\tfrac{4}{7} = .\overline{571428}, \quad \tfrac{5}{7} = .\overline{714285}, \quad \tfrac{6}{7} = .\overline{857142}.$$

These decimal expansions of the six fractions $\frac{a}{7}$, $1 \le a \le 6$, are each periodic of
period length 6, and each contains the same six digits in the same order but starting
at a different place. Starting with the period for 1/7 we find that we go through the
fractions $a/7$ with $a = 1, 3, 2, 6, 4, 5$ when we rotate the period one step at a time.
Do you recognize this sequence of numbers? These are the least positive residues
of $10^0, 10^1, 10^2, 10^3, 10^4, 10^5 \pmod 7$. To prove this, we begin by noting that since
$10^6 \equiv 1 \pmod 7$, we have that

$$\frac{10^6 - 1}{7} = 142857 \text{ is an integer,}$$

which is $\le 6$ digits long. Putting the 1/7 on the other side and dividing through
by $10^6$, we obtain

$$\frac{1}{7} = \frac{142857}{10^6} + \frac{1}{10^6} \cdot \frac{1}{7} = .142857 + \frac{1}{10^6} \cdot \frac{1}{7}.$$

Substituting this expression in for the last term, divided by $10^6$, we obtain

$$\frac{1}{7} = .142857 + \frac{142857}{10^{12}} + \frac{1}{10^{12}} \cdot \frac{1}{7} = \cdots = .\overline{142857},$$

the final equality by repeating this process infinitely often. Now if we multiply this
through by 10, we obtain

$$\frac{10}{7} = 1.\overline{428571}, \text{ so that } \frac{3}{7} = .\overline{428571},$$

and similarly, as $10^2 \equiv 2 \pmod 7$,

$$\frac{10^2}{7} = 14.\overline{285714}, \text{ so that } \frac{2}{7} = .\overline{285714}.$$

We obtain all the other decimal expansions analogously.


What happens when we multiply 1/7 through by $10^k$? For example, if $k = 4$,
then

$$\frac{10^4}{7} = 1428.\overline{571428} = 1428 + \frac{4}{7}.$$

The part after the decimal point is always the fractional part $\{10^k/7\}$, which equals $\frac{a}{7}$ where $a$ is the
least positive residue of $10^k \pmod 7$. We can now give two
results.


Proposition 7.8.1. Suppose that $m$ is an integer that is coprime to 10. If $1 \le
a < m$, then the decimal expansion of $a/m$ is periodic with period of
length $\mathrm{ord}_m(10)$. This is the minimal period length if $(a,m) = 1$.


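Proof sketch. We proceed analogously to the above. Let $n = \mathrm{ord}_m(10)$ and
$r = (10^n - 1)a/m$, so that $r$ is a positive integer $< 10^n$. Let $\overline{r}$ be the sequence of
$n$ digits that give the integer $r$ (padded with leading zeros if necessary). The same argument as above gives that

$$\frac{a}{m} = \frac{r}{10^n} + \frac{1}{10^n} \cdot \frac{a}{m} = \frac{r}{10^n} + \frac{r}{10^{2n}} + \frac{1}{10^{2n}} \cdot \frac{a}{m} = \cdots = .\overline{r}.$$

On the other hand, if this equation holds and the decimal expansion has period $n$,
then $(10^n - 1)a/m = (10^n - 1) \cdot .\overline{r} = r.\overline{r} - .\overline{r} = r$. In other words, $(10^n - 1)a/m$ is
the integer $r$, so that $10^n \equiv 1 \pmod m$ if $(a,m) = 1$.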
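Proposition 7.8.1 can be checked by long division. In this sketch the names `decimal_period` and `multiplicative_order` are illustrative; `decimal_period` assumes $1 \le a < m$ with $m$ coprime to 10, in which case the remainders in the long division are purely periodic.

```python
def multiplicative_order(a, n):
    """Smallest k >= 1 with a**k ≡ 1 (mod n); assumes gcd(a, n) == 1."""
    k, x = 1, a % n
    while x != 1:
        x = (x * a) % n
        k += 1
    return k

def decimal_period(a, m):
    """Repeating block of digits of a/m; assumes 1 <= a < m and gcd(m, 10) == 1."""
    digits, remainder = [], a
    while True:
        remainder *= 10
        digits.append(str(remainder // m))   # next decimal digit
        remainder %= m
        if remainder == a:                   # back at the starting remainder: block complete
            return "".join(digits)

for a in range(1, 7):
    print(a, decimal_period(a, 7))
print(len(decimal_period(1, 13)), multiplicative_order(10, 13))
```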


Exercise 7.8.1.† Suppose that $p$ is an odd prime for which 10 is a primitive root. Let $a_k$ be the
least residue of $10^k \pmod p$, and suppose that $a_k/p = .\overline{r_k}$ where $1 \le r_k < 10^{p-1}$. Prove that $r_k$
is obtained from $r_0$ by removing the leading $k$ digits and concatenating them on to the end.


Exercise 7.8.2. Prove that the decimal expansion of every rational number is eventually periodic.
(One can see why we need “eventually” with the example $\frac{1}{30} = .03333\ldots$.)
7.9. Primes in arithmetic progressions, revisited


We can use the ideas in this chapter to prove that there are infinitely many primes 
in certain arithmetic progressions 1 (mod m). 


Theorem 7.7. There are infinitely many primes $\equiv 1 \pmod 3$.


Proof. Suppose there are only finitely many primes $\equiv 1 \pmod 3$, say $p_1, p_2, \ldots, p_k$.
Let $a = 3 p_1 p_2 \cdots p_k$, and let $q$ be a prime dividing $a^2 + a + 1$. Now $q \ne 3$ as
$a^2 + a + 1 \equiv 1 \pmod 3$. Moreover $q$ divides $a^3 - 1 = (a-1)(a^2 + a + 1)$, but
not $a - 1$ (or else $0 \equiv a^2 + a + 1 \equiv 1 + 1 + 1 = 3 \pmod q$ but $q \ne 3$). Therefore
$\mathrm{ord}_q(a) = 3$ and so $q \equiv 1 \pmod 3$ by Theorem 7.1. Hence $q = p_j$ for some $j$, so
that $q$ divides $a$ as well as $a^2 + a + 1$, and thus $q$ divides $(a^2 + a + 1) - a(a+1) = 1$,
which is impossible.


This, together with Theorem proves that there are infinitely many primes 
in both of the residue classes 1 (mod 3) and 2 (mod 3), as predicted from the data 
at the start of section 5.3.


Exercise 7.9.1. Generalize this argument to primes that are 1 (mod 4), to primes that are 1 
(mod 5), and to primes that are 1 (mod 6). 


In order to generalize this argument to proving the existence of primes $\equiv 1$
$\pmod m$ for every integer $m \ge 3$, including composite $m$, we need to replace the
polynomial $a^2 + a + 1$ by one that recognizes when $a$ has order $m$. Evidently this
must be a divisor of the polynomial $a^m - 1$; indeed $a^m - 1$ divided through by
all of the factors corresponding to orders which are proper divisors of $m$. This
discussion leads us to define the cyclotomic polynomials $\phi_m(t) \in \mathbb{Z}[t]$, inductively,
by the requirement

(7.9.1)    $t^m - 1 = \prod_{d \mid m} \phi_d(t)$  for all $m \ge 1$,

with each $\phi_d(t)$ monic (see also appendix 4E). Therefore $\phi_1(t) = t - 1$,
$\phi_2(t) = t + 1$, $\phi_3(t) = t^2 + t + 1$, $\phi_4(t) = t^2 + 1$, $\phi_5(t) = t^4 + t^3 + t^2 + t + 1, \ldots$.


Theorem 7.8. For any integer $m \ge 2$, there are infinitely many primes $\equiv 1$
$\pmod m$.


Proof. Suppose that $p_1, \ldots, p_k$ are all the primes that are $\equiv 1 \pmod m$ and let
$a = m p_1 \cdots p_k$. Let $q$ be a prime divisor of $\phi_m(a)$, which divides $a^m - 1$, so that
$a^m \equiv 1 \pmod q$. This implies that $(q,a)$ divides $(a^m - 1, a) = 1$ and so $(q,a) = 1$.
In particular $q$ is not a $p_i$ and does not divide $m$.

Let $d = \mathrm{ord}_q(a)$ so that $q \equiv 1 \pmod d$ by Theorem 7.1. Moreover $d$ divides $m$
as $a^m \equiv 1 \pmod q$. But $q$ is not a $p_i$ and so $q \not\equiv 1 \pmod m$, which implies that
$d \ne m$, and therefore $d < m$.

Now $\phi_m(x)$ divides $\frac{x^m - 1}{x^d - 1}$ by definition. Substituting in $x = a$ we deduce that
$q$ divides both $\frac{a^m - 1}{a^d - 1}$ and $a^d - 1$, so that

$$0 \equiv \frac{a^m - 1}{a^d - 1} = \sum_{j=0}^{m/d - 1} (a^d)^j \equiv \sum_{j=0}^{m/d - 1} 1 = m/d \pmod q.$$

This implies that $q$ divides $m/d$, and therefore divides $m$, which contradicts what
we proved above.
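The inductive definition (7.9.1) can be turned into a short computation: divide $t^m - 1$ by $\phi_d(t)$ for each proper divisor $d$ of $m$. The sketch below is an illustration under stated assumptions: polynomials are stored as coefficient tuples (lowest degree first), and the names `divide_exact`, `cyclotomic`, `evaluate`, and `prime_factors` are chosen here, not taken from the text. It also checks, for a few $m$, that every prime factor of $\phi_m(a)$ with $a = m$ is $\equiv 1 \pmod m$, as the argument in the proof predicts when $m$ divides $a$.

```python
from functools import lru_cache

def divide_exact(num, den):
    """Quotient num/den of integer polynomials; den must be monic and divide num exactly."""
    num = list(num)
    quotient = [0] * (len(num) - len(den) + 1)
    for i in range(len(quotient) - 1, -1, -1):
        q = num[i + len(den) - 1]          # den is monic, so this is the next quotient coefficient
        quotient[i] = q
        for j, c in enumerate(den):
            num[i + j] -= q * c
    if any(num):
        raise ValueError("division was not exact")
    return tuple(quotient)

@lru_cache(maxsize=None)
def cyclotomic(m):
    """Coefficients of phi_m(t), using t^m - 1 = product of phi_d(t) over d | m."""
    poly = (-1,) + (0,) * (m - 1) + (1,)   # t^m - 1
    for d in range(1, m):
        if m % d == 0:
            poly = divide_exact(poly, cyclotomic(d))
    return poly

def evaluate(poly, a):
    """Value of the polynomial at the integer a."""
    return sum(c * a**i for i, c in enumerate(poly))

def prime_factors(n):
    """Set of distinct primes dividing n, by trial division."""
    factors, d = set(), 2
    while d * d <= n:
        while n % d == 0:
            factors.add(d)
            n //= d
        d += 1
    if n > 1:
        factors.add(n)
    return factors

for m in range(2, 11):
    value = evaluate(cyclotomic(m), m)
    print(m, cyclotomic(m), sorted(prime_factors(value)))
```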
Additional exercises


Exercise 7.10.1. Prove that we can write any polynomial $f(x)$ mod $p$ of degree $\le p-1$ as

$$f(x) \equiv \sum_{a=0}^{p-1} f(a) \left( 1 - (x-a)^{p-1} \right) \pmod p.$$


Exercise 7.10.2.† Prove that if $f(x) \in \mathbb{Z}[x]$ is monic and has degree $d$ and if prime $p$ divides
$f(0), f(1), \ldots, f(d)$, then $p \le d$ and $p$ divides $f(n)$ for all integers $n$.


Exercise 7.10.3. We will find all powers of 2 and 3 that differ by 1, a special case of Catalan’s 
conjecture mentioned in section 
(a) What are the powers of 3 (mod 8)? What are the powers of 2 (mod 8)?
(b) Show that if $2^m - 3^n \equiv 1 \pmod 8$ for some positive integers $m, n$, then $m = 1$ or 2.
(c) Deduce that the only solutions to $2^m - 3^n = 1$ are $4 - 3 = 2 - 1 = 1$.
(d) Prove that if $3^n - 2^m = 1$ with $n$ odd, then $m = n = 1$.
(e) Prove that if $3^{2k} - 2^m = 1$, then both $3^k - 1$ and $3^k + 1$ are powers of 2, and that this is only
possible if $k = 1$. We deduce that the only solutions to $3^n - 2^m = 1$ are $3 - 2 = 9 - 8 = 1$.


(This is the proof of Levi ben Gershon from around 1320.)


Exercise 7.10.4.† Show that if $\binom{n}{3}$ with $n \ge 3$ has no more than one prime factor which is $> 3$,
then $n = 3, 4, 5, 6, 8, 9, 10$, or 18. (Use exercise 7.10.3.)


Exercise 7.10.5. (a) Prove that if $a > 1$, then the order of $a$ mod $N := a^q - 1$ is exactly $q$.
Now let $q$ be a prime.
(b) Deduce that if prime $p$ divides $a^q - 1$ but not $a - 1$, then $p$ is a prime $\equiv 1 \pmod q$.
(c) Prove that $\left( \frac{a^q - 1}{a - 1}, a - 1 \right) = (q, a - 1)$.
(d)† Prove that there are infinitely many primes $\equiv 1 \pmod q$.


Exercise 7.10.6. Let $p$ be an odd prime, and let $x$, $y$, and $z$ be pairwise coprime, positive
integers.
(a)† Prove that if $p$ divides $z - y$, then $\frac{z^p - y^p}{z - y} \equiv p \pmod{p^2}$.
(b) Show that if $x^p + y^p = z^p$, then there exists an integer $r$ for which $z - y = r^p$ or
$z - y = p^{p-1} r^p$.


(This problem continues on from exercise 3.9.7.)


Exercise 7.10.7. Deduce Theorem 7.6 from (7.5.1) using the Möbius inversion formula (Theorem
4.4).


Exercise 7.10.8. Let $p$ be a prime. Prove that every quadratic non-residue $\pmod p$ is a primitive
root if and only if $p$ is a Fermat prime.


Exercise 7.10.9. Suppose that $g$ is a primitive root modulo odd prime $p$. Prove that $-g$ is also
a primitive root mod $p$ if and only if $p \equiv 1 \pmod 4$.
Exercise 7.10.10. (a) Show that the number of primes up to $N$ equals, exactly,

$$\sum_{2 \le n \le N} \frac{n}{n-1} \left\{ \frac{(n-1)!}{n} \right\} - \frac{2}{3}.$$

(Here $\{t\}$ is the fractional part of $t$, defined as in exercise 1.7.4(b).)


(b) Suppose that n > 1. Show that n and n+ 2 are both odd primes if and only if n(n + 2) 
divides 4((n — 1)!+1) +n. 


Exercise 7.10.11. Prove that if $f(x) \in \mathbb{Z}[x]$ has degree $\le p - 2$, then $\sum_{a=0}^{p-1} f(a) \equiv 0 \pmod p$.


Exercise 7.10.12.† Let $p$ be an odd prime and $k$ be an odd integer which is $\not\equiv 1 \pmod{p-1}$.
Prove that $1^k + 2^k + \cdots + (p-1)^k \equiv 0 \pmod{p^2}$.


Exercise 7.10.13.† Let $a_{n+1} = 2a_n + 1$ for all $n \ge 0$. Can we choose $a_0$ so that this sequence
consists entirely of primes?


We define $n$ to be a base-$b$ pseudoprime if $n$ is composite and $b^{n-1} \equiv 1 \pmod n$.


Exercise 7.10.14. Show that if $n$ is not prime, then it is a base-$b$ pseudoprime if and only if
$\mathrm{ord}_{p^k}(b)$ divides $n - 1$ for every prime power $p^k$ dividing $n$.


Exercise 7.10.15. Suppose that $n$ is a squarefree, composite integer.
(a) Show that $\#\{a \pmod p : a^{n-1} \equiv 1 \pmod p\} = (p-1, n-1)$.
(b) Show that there are $\prod_{p \mid n} (p-1, n-1)$ reduced residue classes $b \pmod n$ for which $n$ is a
base-$b$ pseudoprime.


Exercise 7.10.16. (a) Prove that if $n$ is composite, then $\{b \pmod n : n$ is a base-$b$ pseudoprime$\}$
is a subgroup of the reduced residues mod $n$.
(b)† Prove that if $n$ is not a Carmichael number, then it is not a base-$b$ pseudoprime for at least
half of the reduced residues $b \pmod n$.
(c)† Suppose that $p$ and $2p - 1$ are both prime and let $n = p(2p - 1)$. Prove that

$$\#\{b \pmod n : n \text{ is a base-}b \text{ pseudoprime}\} = \tfrac{1}{2} \phi(n).$$


Exercise 7.10.17. (a) Show that if $p$ is prime, then the Mersenne number $2^p - 1$ is either a
prime or a base-2 pseudoprime.
(b) Show that every Fermat number $2^{2^n} + 1$ is either a prime or a base-2 pseudoprime.
(c) Show that $p^2$ divides $2^{p-1} - 1$ if and only if $p^2$ is a base-2 pseudoprime.


None of these criteria guarantee that there are infinitely many base-2 pseudoprimes.
However this is provable:


Exercise 7.10.18.† Prove that there are infinitely many base-2 pseudoprimes by proving and
developing one of the following two observations:


• Start with 341, and show that if $n$ is a base-2 pseudoprime, then so is $N := 2^n - 1$.
• Prove that if $p > 3$ is prime, then $(4^p - 1)/3$ is a base-2 pseudoprime.


Can you generalize either of these proofs to other bases? 


Exercise 7.10.19. Let $a, b, c$ be pairwise coprime positive integers. Prove that there exists a
(unique) residue class $m_0 \pmod{abc}$ such that if $m \equiv m_0 \pmod{abc}$ and if $am + 1$, $bm + 1$, and
$cm + 1$ are all primes, then their product is a Carmichael number (for example, $a = 1, b = 2, c = 3$
in exercise 7.6.4 with $m_0 = 0$).


Exercise 7.10.20. Let $D$ be a finite set of at least two distinct positive integers, the elements of
which sum to $n$. Suppose that $d$ divides $n$ for every $d \in D$. Prove that if there exists an integer
$m$ for which $p_d := dnm + 1$ is prime for every $d \in D$, then $\prod_{d \in D} p_d$ is a Carmichael number. (In
particular note the case in which $n$ is perfect and $D$ is the set of proper divisors of $n$. The perfect
number 6, for example, gives rise to the triple $6m + 1, 12m + 1, 18m + 1$, which we explored in
exercise 7.6.4.)
Exercise 7.10.21. (a) Prove that .010010000100... is irrational. (Here we put a “1” two digits
after the decimal point, then 3 digits later, then 5 digits later, etc., with all the other digits
being 0, the spacings between the “1”s being $p - 1$ for each consecutive prime $p$.)

(b)† Develop this idea to find a large class of irrationals.


Appendix 7A. Card shuffling 
and Fermat’s Little Theorem 


In this appendix we will define order in terms of card shuffling, give a combinatorial 
proof of Fermat’s Little Theorem, and discuss quick calculations of powers mod n. 


7.11. Card shuffling and orders modulo n 


The cards in a 52-card deck can be arranged in $52! \approx 8 \times 10^{67}$ different orders. Between
card games we shuffle the cards to make the order of the cards unpredictable.
But what if someone can shuffle “perfectly”? How unpredictable will the order of 
the cards then be? Let’s analyze this by carefully figuring out what happens in a 
“perfect shuffle”. There are several ways of shuffling cards, the most common being 
the riffle shuffle. In a riffle shuffle one splits the deck in two, places the two halves 
in either hand, and then drops the cards, using one’s thumbs, in order to more or 
less interlace the cards from the two decks. 


One begins with a deck of 52 cards and, to facilitate our discussion, we will 
call the top card, card 1, the next card down, card 2, etc. If one performs a perfect 
riffle shuffle, one cuts the cards into two 26 card halves, one half with the cards 
1 through 26, the other half with the cards 27 through 52. An “out-shuffle” then 
interlaces the two halves so that the new order of the cards becomes (from the top) 
cards 


1,27, 2, 28, 3,29, 4, 30,.... 


That is, cards $1, 2, 3, \ldots, 26$ go to positions $1, 3, 5, \ldots, 51$, and cards $27, 28, \ldots, 52$
go to positions $2, 4, \ldots, 52$, respectively. We can give formulas for each half:

    $k \mapsto 2k - 1$ for $1 \le k \le 26$,
    $k \mapsto 2k - 52$ for $27 \le k \le 52$.

These coalesce into one formula $k \mapsto 2k - 1 \pmod{51}$ for all $k$, $1 \le k \le 52$. The
top and bottom cards do not move, that is, $1 \to 1$ and $52 \to 52$, so we focus on
understanding the permutation of the other fifty cards:


Any shuffle induces a permutation $\sigma$ on $\{1, \ldots, 52\}$.⁶ For the out-shuffle, $\sigma(1) = 1$,
$\sigma(52) = 52$, and

    $\sigma(1 + m)$ is the least positive residue of $1 + 2m \pmod{51}$ for $1 \le m \le 50$.

To determine what happens after two or more out-shuffles, we simply compute the
function $\sigma^k(\cdot)$ ($= \sigma(\sigma(\ldots\sigma(\cdot)))$, applied $k$ times). Evidently $\sigma^k(1) = 1$, $\sigma^k(52) = 52$, and then

    $\sigma^k(1 + m)$ is the least positive residue of $1 + 2^k m \pmod{51}$ for $1 \le m \le 50$.

Now $2^8 \equiv 1 \pmod{51}$, and so $\sigma^8(1 + m) \equiv 1 + m \pmod{51}$ for all $m$. Therefore
eight perfect out-shuffles return the deck to its original state, so much for the 52!
possible orderings!


Eight more perfect out-shuffles will also return the deck to its original state, a 
total of 16 perfect out-shuffles, and also 24 or 32 or 40, etc. Indeed any multiple of 
8. So we see that the order of 2 (mod 51) is 8 and that $2^r \equiv 1 \pmod{51}$ if and only
if $r$ is divisible by 8. This shows, we hope, why the notion of order is interesting
and exhibits one of the key results (Lemma 7.1.2) about orders.
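The claim about eight out-shuffles is easy to simulate. This sketch models the deck as a Python list with the top card first; `out_shuffle` and `shuffles_to_restore` are illustrative names introduced here.

```python
def out_shuffle(deck):
    """Perfect out-shuffle: cut into halves and interlace, top card staying on top."""
    half = len(deck) // 2
    shuffled = []
    for top, bottom in zip(deck[:half], deck[half:]):
        shuffled += [top, bottom]
    return shuffled

def shuffles_to_restore(n, shuffle):
    """Number of repetitions of `shuffle` needed to return an n-card deck to its start."""
    original = list(range(1, n + 1))
    deck, count = shuffle(original), 1
    while deck != original:
        deck, count = shuffle(deck), count + 1
    return count

print(out_shuffle(list(range(1, 53)))[:8])
print(shuffles_to_restore(52, out_shuffle))
```

The same helper can be passed a different shuffle function, for instance to experiment with the in-shuffle of the exercises.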


Exercise 7.11.1.† An “in-shuffle” is the riffle shuffle that interlaces the cards the other way;
that is, after one shuffle, the order becomes cards 27,1, 28,2,29,...,52,26. Analyze this in an 
analogous way to the above, and determine how many “in-shuffles” it takes to get the cards back 
into their original order. 


Exercise 7.11.2.' What happens when one performs riffle shuffles on n-card decks, with n even? 


Exercise 7.11.3.4 Suppose that the dealer alternates between in-shuffles and out-shuffles. How 
many such pairs of shuffles does it take to get the deck of cards back into their original order? 


Persi Diaconis is a Stanford mathematics professor who left home at the age of
fourteen to learn from sleight-of-hand legend Dai Vernon.⁷ It is said that Diaconis
can shuffle to obtain any permutation of a deck of playing cards. We are interested
in the highest possible order of a shuffle. To analyze this question, remember that
a shuffle can be reinterpreted as a permutation $\sigma$ on $\{1, \ldots, n\}$ (where $n = 52$ for a
usual deck). One way to explicitly write down a permutation is to track the orbit
of each number. For example, for the permutation $\sigma$ on 5 elements given by


    $\sigma(1) = 4, \ \sigma(2) = 5, \ \sigma(3) = 1, \ \sigma(4) = 3, \ \sigma(5) = 2,$
1 gets mapped to 4, which gets mapped to 3, and 3 gets mapped back to 1, whereas
2 gets mapped to 5 and 5 gets mapped back to 2, so we can write
    $\sigma = (1,4,3)(2,5).$


Each of $(1,4,3)$ and $(2,5)$ is a cycle, and cycles cannot be decomposed any further.
Any permutation can be decomposed into cycles in a unique way, the analogy of
the Fundamental Theorem of Arithmetic, for permutations. What is the order of
$\sigma$? Now $\sigma^n = (1,4,3)^n (2,5)^n$, so that $\sigma^n(1) = 1$, $\sigma^n(4) = 4$, and $\sigma^n(3) = 3$ if


®That is, o : {1,...,52} > {1,...,52} such that the o(i) are all distinct (and so o has an inverse). 
“Check out this story, and these larger-than-life characters, on Wikipedia. 
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and only if 3 divides n, while 0” (2) = 2 and o”(5) = 5 if and only if 2 divides n. 
Therefore o” is the identity if and only if 6 divides n, and so o has order 6. 
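
To make the orbit-tracking concrete, here is a small Python sketch that decomposes a permutation into cycles and finds its order by repeated composition. The dictionary representation and the function names are assumptions made for this illustration only.

```python
def cycles(perm):
    """Cycle decomposition of a permutation given as a dict {element: image}."""
    seen, result = set(), []
    for start in sorted(perm):
        if start in seen:
            continue
        cycle, x = [], start
        while x not in seen:          # follow the orbit of `start`
            seen.add(x)
            cycle.append(x)
            x = perm[x]
        result.append(tuple(cycle))
    return result


def order(perm):
    """Least n >= 1 for which the n-fold composition of perm is the identity."""
    n, current = 1, dict(perm)        # `current` holds perm composed n times
    while any(current[x] != x for x in current):
        current = {x: perm[current[x]] for x in current}
        n += 1
    return n


sigma = {1: 4, 2: 5, 3: 1, 4: 3, 5: 2}
print(cycles(sigma), order(sigma))
```

For the example in the text this recovers the cycles (1,4,3) and (2,5) and the order 6.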


Exercise 7.11.4. Suppose that $\sigma$ is a permutation on {1,...,n} and that $\sigma = C_1 \cdots C_k$ where
$C_1, \ldots, C_k$ are disjoint cycles.
(a) Show that the order of $\sigma$ equals the least common multiple of the lengths of the cycles
$C_j$, $1 \le j \le k$.
(b) Use this to find the order of the permutation corresponding to an out-shuffle.
(c) Prove that if $n_1, \ldots, n_k$ are any set of positive integers for which $n_1 + \cdots + n_k = n$, then
there exists a permutation $\sigma = C_1 \cdots C_k$ on {1,...,n}, where each $C_j$ has length $n_j$.
(d) Deduce that the maximum order, m(n), of a permutation $\sigma$ on {1,...,n} is given by
max lcm[$n_1, \ldots, n_k$] over all integers $n_1, \ldots, n_k \ge 1$ for which $n_1 + \cdots + n_k = n$.


Our goal is to determine m(52), the highest order of any shuffle that Diaconis 
can perform on a regular deck of 52 playing cards. However it is unclear how to 
determine m(n) systematically. Working through the possibilities for small n, using 
exercise 7.11.4, we find that


m(5) = 6 obtained from 6 = 3·2 and 5 = 3+2,
m(6) = 6 obtained from 6 = 3·2·1 and 6 = 3+2+1,
m(7) = 12 obtained from 12 = 4·3 and 7 = 4+3,
m(8) = 15 obtained from 15 = 5·3 and 8 = 5+3,
m(9) = 20 obtained from 20 = 5·4 and 9 = 5+4,
m(10) = 30 obtained from 30 = 5·3·2 and 10 = 5+3+2,
m(11) = 30 obtained from 30 = 6·5 and 11 = 6+5,
m(12) = 60 obtained from 60 = 5·4·3 and 12 = 5+4+3.
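
Values of m(n) for small n can be found by exhaustive search over the ways of writing n as a sum of cycle lengths, using the description of m(n) as a maximum of least common multiples. The sketch below is a brute-force illustration only (the name `max_order` is ours); it becomes slow as n grows and is not a systematic method.

```python
from math import gcd


def lcm(a, b):
    return a * b // gcd(a, b)


def max_order(n):
    """Largest lcm of the parts over all partitions of n (exhaustive search)."""
    best = 1

    def search(remaining, largest, current):
        nonlocal best
        # Any cards left over are treated as fixed points (cycles of length 1).
        best = max(best, current)
        # Choose the next cycle length, non-increasing to avoid repeats.
        for part in range(2, min(remaining, largest) + 1):
            search(remaining - part, part, lcm(current, part))

    search(n, n, 1)
    return best


print([(n, max_order(n)) for n in range(5, 13)])
```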


No obvious pattern jumps out (at least to the author) from this data, though one 
observes one technical issue: 


Exercise 7.11.5.† Show that there is a permutation $\sigma = C_1 \cdots C_k$ on {1,...,n} of order m(n)
in which the length of each cycle is either 1 or a power of a distinct prime.


Exercise 7.11.6.† Use the previous exercise to determine m(52).


Exercise 7.11.7. Use exercise to prove that $\log m(n) \sim \sqrt{n \log n}$.


7.12. The “necklace proof” of Fermat’s Little Theorem 


Little Sophie has a necklace-making kit, which comes with wires that each accommodate
p beads, and unlimited supplies of beads of $a$ different colors. How many
genuinely different necklaces can Sophie make? Two necklaces are equivalent if
they can be obtained from each other by a rotation; otherwise they are different;
and so Sophie is asking for the number of equivalence classes of sequences of length
p where each entry is selected from $a$ possible colors.


Suppose we have a necklace with the jth bead having color c(j) for $1 \le j \le p$.
One can rotate the necklace in p different ways: If we rotate the necklace k places
for some k in the range $0 \le k \le p-1$, then the jth bead will have color c(j+k) for
$1 \le j \le p$, where c(.) is taken to be a function of period p. If two of these equivalent
necklaces are identical, then $c(j+k) = c(j+\ell)$ for all j, for some $0 \le k < \ell \le p-1$.
Then c(n+d) = c(n) for all n, where $d = \ell - k \in [1, p-1]$, and so c(md) = c(0)
for all m. As p is prime and does not divide d, the multiples md run through every
residue class mod p; that is, all of the beads in the necklace have the same color.

Therefore we have proved that, other than the $a$ necklaces made of beads of the
same color which each belong to an equivalence class of size 1, all other necklaces
belong to equivalence classes of size p. Since there are $a^p$ possible sequences of
length p with $a$ possible colors for each entry, and $a$ sequences that all have the
same color, the total number of equivalence classes (different necklaces) is

\[ a + \frac{a^p - a}{p}. \]

In particular, we have established that p divides $a^p - a$ for all a, as desired.⁵
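
The count of necklaces can be confirmed for small cases by listing all colorings and grouping them under rotation. This is only a brute-force sketch (the function name is ours, and colors are represented by the integers 0, ..., a-1); it enumerates all $a^p$ sequences, so it is practical only for small p and a.

```python
from itertools import product


def count_necklaces(p, a):
    """Count sequences of length p over a colors, up to rotation."""
    seen, classes = set(), 0
    for seq in product(range(a), repeat=p):
        if seq in seen:
            continue
        classes += 1                      # seq starts a new equivalence class
        for k in range(p):                # mark every rotation of seq as seen
            seen.add(seq[k:] + seq[:k])
    return classes


for p, a in [(3, 2), (5, 3), (7, 2)]:
    print(p, a, count_necklaces(p, a), a + (a**p - a) // p)
```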


Exercise 7.12.1. Let p be prime. Let X denote a finite set and $f : X \to X$ where $f^p = i$, the
identity map. (Here $f^p$ means composing f with itself p times.) Let $X_{\text{fixed}} := \{x \in X : f(x) = x\}$.
(a) Prove that $|X| \equiv |X_{\text{fixed}}| \pmod p$.
Let G be a finite multiplicative group and $X = \{(x_1, \ldots, x_p) \in G^p : x_1 \cdots x_p = 1\}$.
(b)* Deduce that $\#\{g \in G : g \text{ has order } p\} \equiv |G|^{p-1} - 1 \pmod p$.
(c) Deduce that if p divides the order of finite group G, then G contains an element of order p.
Combined with Lagrange’s Theorem, Corollary 7.23.1 of appendix 7D, this is an “if and
only if” criterion.


Exercise 7.12.2. Let p be a given prime. 
(a) Use of appendix 4C to determine the number of irreducible polynomials mod p of 
prime degree q. 
(b) Deduce that $q^p \equiv q \pmod p$ for every prime q.
(c) Deduce Fermat’s Little Theorem. 


More combinatorics and number theory 


[1] Melvin Hausner, Applications of a simple counting technique, Amer. Math. Monthly 90 (1983), 
127-129. 


7.13. Taking powers efficiently 


How can one raise a residue class mod m to the nth power “quickly”, when n is
very large? In 1785 Legendre computed high powers mod p by fast exponentiation:
To determine $5^{65} \pmod{161}$, we write 65 in base 2, that is, $65 = 2^6 + 2^0$, so
that $5^{65} = 5^{2^6} \cdot 5^{2^0}$. Let $f_0 = 5$ and $f_1 \equiv f_0^2 \equiv 5^2 \equiv 25 \pmod{161}$. Next let
$f_2 \equiv f_1^2 \equiv 25^2 \equiv 142 \pmod{161}$, and then $f_3 \equiv f_2^2 \equiv 142^2 \equiv 39 \pmod{161}$.
We continue computing $f_j \equiv f_{j-1}^2 \pmod{161}$ by successive
squaring: $f_4 \equiv 72$, $f_5 \equiv 32$, $f_6 \equiv 58 \pmod{161}$ and so $5^{65} = 5^{64+1} \equiv f_6 \cdot f_0 \equiv
58 \cdot 5 \equiv 129 \pmod{161}$. We have determined the value of $5^{65} \pmod{161}$ in seven
multiplications mod 161, as opposed to 64 multiplications by the more obvious
algorithm.


5 We've seen that Fermat’s Little Theorem arises in many different contexts. Even its earliest
discoverers got there for different reasons: Fermat, Euler, and Lagrange were led to Fermat’s Little
Theorem by the search for perfect numbers, whereas Gauss was led to it by studying the periods in the
decimal expansion of fractions (as in section 7.8). It seems to be a universal truth, rather than simply
an ad hoc discovery.

In general to compute $a^n \pmod m$ quickly: Define

\[ f_0 = a \quad\text{and then}\quad f_j \equiv f_{j-1}^2 \pmod m \quad\text{for } j = 1, 2, \ldots, j_1, \]

where $j_1$ is the largest integer for which $2^{j_1} \le n$. Writing n in binary, say as
$n = 2^{j_1} + 2^{j_2} + \cdots + 2^{j_\ell}$ with $j_1 > j_2 > \cdots > j_\ell \ge 0$, let $g_1 = f_{j_1}$ and then
$g_i \equiv g_{i-1} f_{j_i} \pmod m$ for $i = 2, 3, \ldots, \ell$. Therefore

\[ g_\ell \equiv f_{j_1} \cdot f_{j_2} \cdots f_{j_\ell} \equiv a^{2^{j_1} + 2^{j_2} + \cdots + 2^{j_\ell}} = a^n \pmod m. \]

This involves $j_1 + \ell - 1 \le 2j_1 \le \frac{2\log n}{\log 2}$ multiplications mod m as opposed to n
multiplications mod m by the more obvious algorithm.
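
The successive-squaring procedure can be sketched in a few lines of Python. This is an illustrative version, not the author's code: the function name `power_mod` is ours, and it reads the binary digits of n from the lowest to the highest (rather than from $j_1$ down to $j_\ell$), which uses the same squares $f_j$ and the same number $j_1 + \ell - 1$ of multiplications. It also counts the multiplications so the count can be compared with the text.

```python
def power_mod(a, n, m):
    """Return (a^n mod m, number of modular multiplications used)."""
    result = None          # product of the f_j selected so far (None = empty product)
    square = a % m         # f_0, then f_1, f_2, ... by successive squaring
    mults = 0
    while n > 0:
        if n & 1:          # this binary digit of n is 1, so f_j is needed
            if result is None:
                result = square
            else:
                result = (result * square) % m
                mults += 1
        n >>= 1
        if n > 0:          # another binary digit remains, so square again
            square = (square * square) % m
            mults += 1
    return (1 % m if result is None else result), mults


print(power_mod(5, 65, 161))
```

Python's built-in three-argument `pow(a, n, m)` computes the same residue and is the natural choice in practice; the version above only exposes the bookkeeping.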

One can often use fewer multiplications. For example, for 31 = 1+2+4+8+16
the above uses 8 multiplications, but we can use just 7 multiplications if, instead,
we determine $a^{31} \pmod m$ by computing $a^2 = a \cdot a$; $a^3 = a^2 \cdot a$; $a^6 = a^3 \cdot a^3$;
$a^{12} = a^6 \cdot a^6$; $a^{24} = a^{12} \cdot a^{12}$; $a^{30} = a^{24} \cdot a^6 \pmod m$; and finally $a^{31} = a^{30} \cdot a \pmod m$.
These exponents form an addition chain, a sequence of integers $e_1 = 1 < e_2 < \cdots < e_\ell$
where, for all k > 1, we have $e_k = e_i + e_j$ for some $i, j \in \{1, \ldots, k-1\}$.
In the example above, the binary digits of 31 led to the addition
chain 1, 2, 3, 4, 7, 8, 15, 16, 31, but the addition chain 1, 2, 3, 6, 12, 24, 30, 31 is
shorter.
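
As a check on the definition, the following Python sketch tests whether a list of exponents is an addition chain and uses a chain to compute a power, counting one multiplication per step. The helper names are hypothetical and chosen for this illustration.

```python
def is_addition_chain(chain):
    """True if chain starts at 1, increases, and each later term is a sum of two earlier terms."""
    if not chain or chain[0] != 1:
        return False
    for k in range(1, len(chain)):
        earlier = chain[:k]
        if chain[k] <= chain[k - 1]:
            return False
        if not any(chain[k] - e in earlier for e in earlier):
            return False
    return True


def power_by_chain(a, chain, m):
    """Compute a^(last term of chain) mod m along the chain; returns (value, multiplications)."""
    powers = {1: a % m}
    mults = 0
    for k in range(1, len(chain)):
        e = chain[k]
        i = next(x for x in chain[:k] if e - x in powers)   # e = i + (e - i)
        powers[e] = (powers[i] * powers[e - i]) % m
        mults += 1
    return powers[chain[-1]], mults


binary_chain = [1, 2, 3, 4, 7, 8, 15, 16, 31]
shorter_chain = [1, 2, 3, 6, 12, 24, 30, 31]
print(power_by_chain(3, binary_chain, 1009), power_by_chain(3, shorter_chain, 1009))
```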

For most exponents n, there is an addition chain which is substantially shorter
than $j_1 + \ell - 1$, though never less than half that size. There are many open questions
about addition chains. The best known is Scholz’s conjecture that the shortest
addition chain for $2^n - 1$ has length $\le n - 1$ plus the length of the shortest addition
chain for n. For much more on addition chains, see Knuth’s classic book [Knu98].


7.14. Running time: The desirability of polynomial time algorithms 


In this section we discuss how to measure how fast an algorithm is. The inputs into
the algorithm in the previous section for calculating $a^n \pmod m$ are the integers
a and m, with $1 \le a \le m$, and the exponent n. We will suppose that m has d
digits (so that d is proportional to log m). The usual algorithms for adding and
subtracting integers with d digits take about 2d steps, whereas the usual algorithm
for multiplication takes about $d^2$ steps.⁶


Exercise 7.14.1. Justify that multiplying two residues mod m together and reducing mod m
takes no more than $2d^2$ steps.


The algorithm described in the previous section involves about c log n multiplications
of two residues mod m, for some constant c > 0, and so the total number
of steps is proportional to

\[ (\log m)^2 \log n. \]


Is this good? Given any mathematical problem, the cost (measured by the number
of steps) of an algorithm to resolve the question must include the time taken to read
the input data, which can be measured by the number of digits, D, in the input.
In this case the input is the numbers a, m, and n, so that D is proportional to
log m + log n. Now if a and m are fixed and we allow n to grow, then the algorithm
takes CD steps for some constant C > 0, which is C times as long as it takes to
read the input. You cannot hope to do much better than that. On the other hand,
if m and n are roughly the same size, then the algorithm takes time proportional
to $D^3$. We still regard this as fast: any algorithm whose speed is bounded by a
polynomial in D is a polynomial time algorithm and is considered to be pretty fast.

6 Since we have to multiply each pair of digits together, one from each of the given numbers.


It is important to distinguish between a mathematical problem and an algorithm
used for resolving it. There can be many choices of algorithm and one wants
a fast one. However, knowing only a slowish algorithm, however clever it may
seem, does not necessarily mean that there is no fast algorithm.


Let P be the class of problems that can be resolved by an algorithm that runs
in polynomial time. Few mathematical problems belong to P and the key question
is whether we can identify which problems do. We’ll discuss P in section 10.4.


Exercise 7.14.2. Prove that the Euclidean algorithm works in polynomial time. 


Appendices. The extended version of chapter 7 has the following additional 
appendices: 


Appendix 7B. Orders and primitive roots discusses how the order mod $p^k$ of
an integer prime to p varies as k increases. As a consequence we determine the
structure of $(\mathbb{Z}/p^k\mathbb{Z})^*$ and calculate orders modulo composite m. We go on to
discuss Gauss’s extraordinary algorithm to construct primitive roots mod p, which
works even better in the computer age than it did in his time.


Appendix 7C. Finding nth roots modulo prime powers introduces the question
of explicitly determining all of the nth roots mod p of a given nth power. Using
the ideas in appendix 7B we can efficiently find all nth roots of 1 mod p, so our
question boils down to finding one mth root, where m = (n, p-1). We use this to
find the nth roots of a (mod $p^k$) for increasing k. We finish by looking at special
cases in which one can find nth roots mod p through a formula (this is not always
the case), which works if n divides p - 1 and $\left(n, \frac{p-1}{n}\right) = 1$.

Appendix 7D. Orders for finite groups. Here we generalize the concept of order 
and Fermat’s Little Theorem to arbitrary finite groups, and Wilson’s Theorem if 
the group is also commutative. Finally we look at normal subgroups and develop 
the analogy of the Fundamental Theorem of Arithmetic, for finite groups. 


Appendix 7E. Constructing finite fields. We show that all finite fields have
order $p^r$ for some prime p and integer $r \ge 1$ and show how to construct them.
Moreover we find two different generalizations of (7.4.1).


Appendix 7F. Sophie Germain and Fermat’s Last Theorem proves Sophie Germain’s
famous result, which substantially restricts possible solutions to Fermat’s
Last Theorem with exponent p, when p and 2p + 1 are both primes.


Appendix 7G. Primes of the form $2^n + k$ shows how to construct integers k
such that $2^n + k$ is never prime, using a surprisingly simple idea of Paul Erdős. We
then go on to extend this idea to show there are integers k for which there are no
primes of the form $F_n + k$ as $F_n$ ranges through the Fibonacci numbers.


Appendix 7H. Further congruences. Here we study Fermat quotients and in
particular whether $p^2$ ever divides $2^p - 2$. We also look at binomial coefficients mod
p*, Bernoulli numbers mod p, the Wilson quotient, sums of powers of integers mod 
p*, and go beyond Fermat’s Little Theorem. 


Appendix 7I. Primitive prime factors of recurrence sequences are prime factors
of a particular term of the recurrence sequence that divide no earlier term. For 
certain recurrence sequences, we show that every term, except perhaps the first 
few, has a primitive prime factor, and we discuss what is known on this subject. 


Chapter 8 


Quadratic residues 


In this chapter we will develop an understanding of the squares mod n, in particular 
how many there are and how to quickly identify whether a given residue is a square 
mod n. We mostly discuss the squares modulo primes and from there understand 
the squares mod prime powers via “lifting”, and modulo composites through the 
Chinese Remainder Theorem. 


8.1. Squares modulo prime p 


There are two types of squares mod p. We always have $0^2 \equiv 0 \pmod p$. Then
there are the “quadratic residues (mod p)”, which are the non-zero residues a 
(mod p) which are congruent to a square modulo p. All other residue classes are 
“quadratic non-residues”. If there is no ambiguity, we simply say “residues” and 
“non-residues”. In the next table we list the quadratic residues modulo each of the 
primes between 5 and 17. 


Modulus    Quadratic residues
   5       1, 4
   7       1, 2, 4
  11       1, 3, 4, 5, 9
  13       1, 3, 4, 9, 10, 12
  17       1, 2, 4, 8, 9, 13, 15, 16
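
The table can be regenerated directly from the definition by squaring every non-zero residue. A minimal Python sketch (the function name is ours):

```python
def quadratic_residues(p):
    """Sorted list of the non-zero residues mod p that are congruent to a square."""
    return sorted({(r * r) % p for r in range(1, p)})


for p in (5, 7, 11, 13, 17):
    print(p, quadratic_residues(p))
```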


Exercise 8.1.1. (a) Prove that 337 is not a square (that is, the square of an integer) by 
reducing it mod 5. 
(b) Prove that 391 is not a square by reducing it mod 7. 
(c) Prove that there do not exist integers x and y for which $x^2 - 3y^2 = -1$, by reducing any
solution mod 3.


In each row of our table there seem to be $\frac{p-1}{2}$ quadratic residues mod p:


Lemma 8.1.1. The distinct quadratic residues mod p are given by $1^2, 2^2, \ldots, \left(\frac{p-1}{2}\right)^2$
(mod p).


Proof. If $r^2 \equiv s^2 \pmod p$ with $1 \le s < r \le p-1$, then $p \mid r^2 - s^2 = (r-s)(r+s)$
and so p divides either r - s or r + s. Now 0 < r - s < p and so p does not divide
r - s. Therefore p divides r + s, and 0 < r + s < 2p, so we must have r + s = p.
Hence the residues of $1^2, 2^2, \ldots, \left(\frac{p-1}{2}\right)^2 \pmod p$ are distinct, and if s = p - r, then
$s^2 \equiv (-r)^2 \equiv r^2 \pmod p$. This implies our result.


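Define the Legendre symbol as follows: For each odd prime p let

\[ \left(\frac{a}{p}\right) = \begin{cases} 0 & \text{if } a \equiv 0 \pmod p, \\ 1 & \text{if } a \text{ is a quadratic residue mod } p, \\ -1 & \text{if } a \text{ is a quadratic non-residue mod } p. \end{cases} \]


Exercise 8.1.2. (a) Prove that if $a \equiv b \pmod p$, then $\left(\frac{a}{p}\right) = \left(\frac{b}{p}\right)$.
(b) Prove that $\sum_{a=0}^{p-1} \left(\frac{a}{p}\right) = 0$.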


Corollary 8.1.1. There are exactly $1 + \left(\frac{a}{p}\right)$ residue classes b (mod p) for which
$b^2 \equiv a \pmod p$.

Proof. If a is a quadratic non-residue, there are no solutions. For a = 0 if $b^2 \equiv 0 \pmod p$,
then $b \equiv 0 \pmod p$ so there is just one solution. If a is a quadratic
residue, then, by definition, there exists b such that $b^2 \equiv a \pmod p$, and then there
are the two solutions $(p-b)^2 \equiv b^2 \equiv a \pmod p$ and no others, by the proof in
Lemma 8.1.1 (or by Proposition 7.4.1). We have therefore proved

\[ \#\{b \pmod p : b^2 \equiv a \pmod p\} = \begin{cases} 1 & \text{if } a \equiv 0 \pmod p, \\ 2 & \text{if } a \text{ is a quadratic residue mod } p, \\ 0 & \text{if } a \text{ is a quadratic non-residue mod } p. \end{cases} \]

This equals $1 + \left(\frac{a}{p}\right)$, looking above at the definition of the Legendre symbol.


Theorem 8.1. We have $\left(\frac{ab}{p}\right) = \left(\frac{a}{p}\right)\left(\frac{b}{p}\right)$ for any integers a, b. That is:
(i) The product of two quadratic residues (mod p) is a quadratic residue.
(ii) The product of a quadratic residue and a non-residue is itself a non-residue.
(iii) The product of two quadratic non-residues (mod p) is a quadratic residue.


Proof (Gauss). (i) If $a \equiv A^2$ and $b \equiv B^2$, then $ab \equiv (AB)^2 \pmod p$.

Let R := {r (mod p) : (r/p) = 1} be the set of quadratic residues mod p. We
saw that if (a/p) = 1, then (ar/p) = 1 for all $r \in R$. In other words, $ar \in R$; that
is, $aR \subset R$. The elements of aR are distinct, so that |aR| = |R|, and therefore
aR = R.

(ii) Let N = {n (mod p) : (n/p) = -1} be the set of quadratic non-residues
mod p, so that $N \cup R$ partitions the reduced residues mod p. By exercise 3.5.2
we deduce that $aR \cup aN$ also partitions the reduced residues mod p, and therefore
aN = N since aR = R. That is, the elements of the set {an : (n/p) = -1} are all
quadratic non-residues mod p.

By Lemma 8.1.1 we know that $|R| = \frac{p-1}{2}$, and hence $|N| = \frac{p-1}{2}$ since $N \cup R$
partitions the p - 1 reduced residues mod p.

(iii) In (ii) we saw that if (n/p) = -1 and (a/p) = 1, then (na/p) = -1. Hence
$nR \subset N$ and, as $|nR| = |R| = \frac{p-1}{2} = |N|$, we deduce that nR = N. But $nR \cup nN$
partitions the reduced residues mod p, and so nN = R. That is, the elements of
the set {nb : (b/p) = -1} are all quadratic residues mod p.
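
Both the multiplicativity of the Legendre symbol and the count $1 + (a/p)$ of square roots can be checked numerically for small primes. The sketch below evaluates the symbol straight from its definition by searching for a square root; it is a brute-force check under that assumption, not an efficient method, and the function name is ours.

```python
def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p, computed from the definition."""
    a %= p
    if a == 0:
        return 0
    return 1 if any((b * b - a) % p == 0 for b in range(1, p)) else -1


p = 13
for a in range(p):
    roots = [b for b in range(p) if (b * b - a) % p == 0]
    print(a, legendre(a, p), roots)
```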


Exercise 8.1.3. Suppose that prime p does not divide ab. 
(a) Prove that $\left(\frac{a/b}{p}\right) = \left(\frac{ab}{p}\right)$.
(b) Prove that there are non-zero residues x and y (mod p) for which $ax^2 + by^2 \equiv 0 \pmod p$ if
and only if $\left(\frac{-ab}{p}\right) = 1$.


Exercise 8.1.4. Prove that if odd prime p divides $b^2 - 4ac$ but neither a nor c, then $\left(\frac{a}{p}\right) = \left(\frac{c}{p}\right)$.


Exercise 8.1.5. Let p be a prime > 3. Prove that if there is no residue x (mod p) for which
$x^2 \equiv 2 \pmod p$, and no residue y (mod p) for which $y^2 \equiv 3 \pmod p$, then there is a residue z
(mod p) for which $z^2 \equiv 6 \pmod p$.


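We deduce from Theorem 8.1 that $\left(\frac{\cdot}{p}\right)$ is a multiplicative function. Therefore if
we have a factorization of a into prime factors as $a = \pm q_1^{e_1} q_2^{e_2} \cdots q_k^{e_k}$, and (a, p) = 1,
then¹

\[ \left(\frac{a}{p}\right) = \left(\frac{\pm 1}{p}\right) \prod_{i=1}^{k} \left(\frac{q_i}{p}\right)^{e_i} = \left(\frac{\pm 1}{p}\right) \prod_{\substack{1 \le i \le k \\ e_i \text{ odd}}} \left(\frac{q_i}{p}\right), \]

since $(q/p)^2 = 1$ whenever $p \nmid q$ as this implies that $\left(\frac{q_i}{p}\right)^{e_i} = 1$ if $e_i$ is even, and
$\left(\frac{q_i}{p}\right)^{e_i} = \left(\frac{q_i}{p}\right)$ if $e_i$ is odd. Therefore, in order to determine $\left(\frac{a}{p}\right)$ for all integers a,
it is only necessary to know the values of $\left(\frac{-1}{p}\right)$, and of $\left(\frac{q}{p}\right)$ for all primes q.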


Exercise 8.1.6. One can write each non-zero residue mod p as a power of a primitive root g. 
(a) Prove that the quadratic residues are precisely those residues that are an even power of g, 
and the quadratic non-residues are those that are an odd power. 


(b) Deduce that $\left(\frac{g}{p}\right) = -1$.


Exercise 8.1.7. (a) Show that if n is odd and p divides $a^n - 1$, then $\left(\frac{a}{p}\right) = 1$.
(b) Show that if n is prime and p divides $a^n - 1$, but $a \not\equiv 1 \pmod p$, then $p \equiv 1 \pmod n$.
(c) Give an example to show that (b) can be false if we only assume that n is odd.


Exercise 8.1.8. (a) Prove that, for every prime p # 2,5, at least one of 2, 5, and 10 is a 
quadratic residue mod p. 
(b)† Prove that, for every prime p > 5, there are two consecutive positive integers that are both
quadratic residues mod p and are both < 10. 


8.2. The quadratic character of a residue 


Fermat’s Little Theorem (Theorem 7.2) states that the (p - 1)st power of any
reduced residue mod p is congruent to 1 (mod p). Are there other patterns to be 
found among the lower powers? 


1 Each of “±” and “±1” is to be read as “either ‘+’ or ‘−’”. We deal with these two cases together
since the proofs are entirely analogous, taking care throughout to be consistent with the choice of sign.

   a   a^2  a^3  a^4
   1    1    1    1
   2   -1   -2    1
  -2   -1    2    1
  -1    1   -1    1

The powers of a mod 5

   a   a^2  a^3  a^4  a^5  a^6
   1    1    1    1    1    1
   2   -3    1    2   -3    1
   3    2   -1   -3   -2    1
  -3    2    1   -3    2    1
  -2   -3   -1    2    3    1
  -1    1   -1    1   -1    1

The powers of a mod 7

As expected the (p - 1)st column is all 1’s, but there is another pattern that
emerges: The entries in the “middle” column, that is, the $a^2$ column mod 5 and the
$a^3$ column mod 7, are all -1’s and 1’s. This column represents the least residues of
numbers of the form $a^{\frac{p-1}{2}} \pmod p$, and it appears that these are all -1’s and 1’s.
Can we decide which are +1 and which are -1? For p = 5 we see that $1^2 \equiv 4^2 \equiv 1 \pmod 5$
and $2^2 \equiv 3^2 \equiv -1 \pmod 5$; recall that 1 and 4 are the quadratic residues
mod 5. For p = 7 we see that $1^3 \equiv 2^3 \equiv 4^3 \equiv 1 \pmod 7$ and $3^3 \equiv 5^3 \equiv 6^3 \equiv -1 \pmod 7$;
recall that 1, 2, and 4 are the quadratic residues mod 7. So we have
observed a pattern: The ath entry in the middle column is +1 if a is a quadratic
residue mod p, and it is -1 if a is a quadratic non-residue mod p; in either case it
equals the value of the Legendre symbol, $\left(\frac{a}{p}\right)$. This observation was proved by
Euler in 1732.


Theorem 8.2 (Euler’s criterion). We have $a^{\frac{p-1}{2}} \equiv \left(\frac{a}{p}\right) \pmod p$ for all primes p
and integers a.


Proof #1. If $\left(\frac{a}{p}\right) = 1$, then there exists b such that $b^2 \equiv a \pmod p$ so that
$a^{\frac{p-1}{2}} \equiv b^{p-1} \equiv 1 \pmod p$, by Fermat’s Little Theorem.

If $\left(\frac{a}{p}\right) = -1$, then we proceed as in Gauss’s proof of Wilson’s Theorem though
pairing up the residues slightly differently. Let

\[ S = \{(r,s) : 1 \le r < s \le p-1,\ rs \equiv a \pmod p\}. \]

Note that if $rs \equiv a \pmod p$, then $r \not\equiv s \pmod p$, or else $a \equiv r^2 \pmod p$, contradicting
that $\left(\frac{a}{p}\right) = -1$. Therefore each integer m, $1 \le m \le p-1$, appears exactly
once, in exactly one pair in S. We deduce that

\[ (p-1)! = \prod_{(r,s) \in S} rs \equiv a^{|S|} = a^{\frac{p-1}{2}} \pmod p, \]

and the result follows from Wilson’s Theorem.


For example, for p = 13, a = 2 we have

\[ -1 \equiv 12! = (1 \cdot 2)(3 \cdot 5)(4 \cdot 7)(6 \cdot 9)(8 \cdot 10)(11 \cdot 12) \equiv 2^6 \pmod{13}. \]
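
The pairing used in Proof #1 is easy to reproduce by machine. The following Python sketch (with a hypothetical helper `pairs`) lists the set S for p = 13 and a = 2 and confirms that the product of all the pairs is both (p - 1)! and $a^{(p-1)/2}$ modulo p.

```python
from math import factorial


def pairs(a, p):
    """All (r, s) with 1 <= r < s <= p-1 and r*s ≡ a (mod p)."""
    return [(r, s) for r in range(1, p) for s in range(r + 1, p)
            if (r * s - a) % p == 0]


p, a = 13, 2
S = pairs(a, p)
product = 1
for r, s in S:
    product = (product * r * s) % p

print(S)
print(product, factorial(p - 1) % p, pow(a, (p - 1) // 2, p))
```

All three printed residues equal p - 1, that is, -1 (mod 13), as Wilson’s Theorem and Euler’s criterion predict for a non-residue.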


Exercise 8.2.1.† Prove Euler’s criterion for (a/p) = 1, by evaluating (p - 1)! (mod p) as in the
second part of proof #1, but now taking account of the solutions r (mod p) to $r^2 \equiv a \pmod p$.


Proof #2 of Euler’s criterion. We began Proof #1 by showing that if $\left(\frac{a}{p}\right) = 1$,
then $a^{\frac{p-1}{2}} \equiv 1 = \left(\frac{a}{p}\right) \pmod p$. This implies that a is a root of $x^{\frac{p-1}{2}} - 1 \pmod p$.
By Lemma 8.1.1 there are exactly $\frac{p-1}{2}$ quadratic residues mod p, and we now know
that these are all roots of $x^{\frac{p-1}{2}} - 1 \pmod p$ and are therefore all of the roots of
$x^{\frac{p-1}{2}} - 1 \pmod p$. That is,

\[ (8.2.1) \qquad x^{\frac{p-1}{2}} - 1 \equiv \prod_{\substack{1 \le a \le p \\ (a/p) = 1}} (x - a) \pmod p. \]

In (7.4.1) we noted that

\[ x^{p-1} - 1 \equiv (x-1)(x-2) \cdots (x-(p-1)) \pmod p; \]

that is, the p - 1 roots of $x^{p-1} - 1 = \left(x^{\frac{p-1}{2}} - 1\right)\left(x^{\frac{p-1}{2}} + 1\right) \pmod p$ are precisely
the reduced residues mod p, each occurring exactly once. Since the set of reduced
residues mod p is the union of the set of quadratic residues and the set of quadratic
non-residues, we can divide this last equation through by (8.2.1), to obtain

\[ (8.2.2) \qquad x^{\frac{p-1}{2}} + 1 \equiv \prod_{\substack{1 \le b \le p \\ (b/p) = -1}} (x - b) \pmod p. \]

This implies that if b is a quadratic non-residue mod p, then $b^{\frac{p-1}{2}} + 1 \equiv 0 \pmod p$;
that is, $b^{\frac{p-1}{2}} \equiv -1 = \left(\frac{b}{p}\right) \pmod p$.


We can use Euler’s criterion to determine the value of Legendre symbols as
follows: $\left(\frac{3}{13}\right) = 1$ since $3^6 = 27^2 \equiv 1^2 \equiv 1 \pmod{13}$, and $\left(\frac{2}{13}\right) = -1$ since
$2^6 = 64 \equiv -1 \pmod{13}$.


Exercise 8.2.2. Let p be an odd prime. Explain how one can determine the integer $\left(\frac{a}{p}\right)$ by
knowing $a^{\frac{p-1}{2}} \pmod p$. (Euler’s criterion gives a congruence, but here we are asking for the
value of the integer $\left(\frac{a}{p}\right)$.)


Exercise 8.2.3. Use Euler’s criterion to reprove Theorem 


Proof #3 of Euler’s criterion. Let g be a primitive root mod p. We have
$g^{\frac{p-1}{2}} \equiv -1 \pmod p$ by exercise 7.5.2. Suppose that $a \equiv g^r \pmod p$ for some
integer r, so that $a^{\frac{p-1}{2}} \equiv (g^r)^{\frac{p-1}{2}} = \left(g^{\frac{p-1}{2}}\right)^r \equiv (-1)^r \pmod p$. If a is a quadratic
residue mod p, then r is even by exercise 8.1.6 and so $a^{\frac{p-1}{2}} \equiv (-1)^r \equiv 1 \pmod p$.
If a is a quadratic non-residue mod p, then r is odd, and so $a^{\frac{p-1}{2}} \equiv (-1)^r \equiv -1 \pmod p$.


Square roots and non-squares modulo p. How can we tell whether a reduced
residue a (mod p) is a square mod p? One idea is to try to find the square root, but
it is not clear how to go about this efficiently (for example, try to find the square
root of 77 (mod 101)). One consequence of Euler’s criterion is that one does not
have to try to find the square root to determine whether a given residue class is a
square mod p. Indeed one can determine whether a is a square mod p by calculating
$a^{\frac{p-1}{2}} \pmod p$. This might look like it will be equally difficult, but we have shown
in section 7.13 of appendix 7A that one can calculate a high power of a mod p quite
efficiently.
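
In code, this test is a single modular exponentiation. The sketch below uses Python's built-in `pow(a, e, p)`, which performs fast modular exponentiation; the function names are ours, and the exhaustive search is included only for comparison.

```python
def is_square_mod_p(a, p):
    """Euler's criterion: for an odd prime p not dividing a,
    a is a square mod p exactly when a^((p-1)/2) ≡ 1 (mod p)."""
    return pow(a, (p - 1) // 2, p) == 1


def square_roots_by_search(a, p):
    """All b in [0, p-1] with b^2 ≡ a (mod p), found by trying every residue."""
    return [b for b in range(p) if (b * b - a) % p == 0]


print(is_square_mod_p(77, 101), square_roots_by_search(77, 101))
```

The criterion decides squareness without producing a root; the search does produce the roots but takes about p steps, which is hopeless for large p.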


There are some special cases in which one can determine a square root of a
(mod p) quite easily. For example, when $p \equiv 3 \pmod 4$:


Exercise 8.2.4. Let p be a prime $\equiv 3 \pmod 4$. Show that if $\left(\frac{a}{p}\right) = 1$ and $b \equiv a^{\frac{p+1}{4}} \pmod p$,
then $b^2 \equiv a \pmod p$. (This idea is explored further in section of appendix 7C.)

However if $p \equiv 1 \pmod 4$, then it is not so easy to determine a square root.
For example, -1 is a square mod p (as we will prove in the next section) but we do
not know a simple practical way to quickly determine a square root of -1 (mod p).
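
For the easy case of a prime $p \equiv 3 \pmod 4$, the formula $b \equiv a^{(p+1)/4}$ from exercise 8.2.4 translates directly into code. This is a small sketch with a hypothetical function name; it returns `None` when a turns out not to be a square, and it deliberately refuses primes that are 1 (mod 4), where this formula does not apply.

```python
def sqrt_mod_p_3mod4(a, p):
    """Square root of a modulo a prime p ≡ 3 (mod 4), or None if a is not a square."""
    if p % 4 != 3:
        raise ValueError("this formula assumes p ≡ 3 (mod 4)")
    b = pow(a, (p + 1) // 4, p)
    return b if (b * b - a) % p == 0 else None


print(sqrt_mod_p_3mod4(2, 7), sqrt_mod_p_3mod4(3, 11), sqrt_mod_p_3mod4(3, 7))
```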


How can one quickly find a quadratic non-residue mod p? One would think it
would be easy, as half of the residues mod p are quadratic non-residues, but there
is no simple way to guarantee finding one quickly. In practice it is most efficient
to select numbers in [1, p - 1] at random, independently. The probability that any
given selection is a quadratic residue is $\frac{1}{2}$, so the probability that every one of
the first k choices is a quadratic residue is $1/2^k$. Therefore, the probability that
none of the first 20 selections is a quadratic non-residue mod p is less than one
in a million. Moreover it is easy to verify whether each selection is a quadratic
residue mod p, using Euler’s criterion. This algorithm will almost always rapidly
determine a quadratic non-residue mod p, but one might just be terribly unlucky
and the algorithm might fail.
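
A sketch of this randomized search in Python follows. The function name, the `max_tries` parameter, and the optional `rng` argument are our own choices for illustration; the test for a non-residue is Euler’s criterion, and the function returns `None` in the unlucky event that every selection was a residue.

```python
import random


def find_non_residue(p, max_tries=20, rng=random):
    """Try up to max_tries random a in [1, p-1]; return a quadratic non-residue
    mod the odd prime p, or None if every selection happened to be a residue."""
    for _ in range(max_tries):
        a = rng.randrange(1, p)
        if pow(a, (p - 1) // 2, p) == p - 1:   # a^((p-1)/2) ≡ -1 (mod p)
            return a
    return None   # happens with probability 1 / 2**max_tries


print(find_non_residue(101))
```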


It is useful to determine for which primes p a given small integer a is a quadratic 
residue (mod p). We study this for a = —1, 2, and —2 in the next few sections. 


8.3. The residue —1 


Theorem 8.3. If p is an odd prime, then -1 is a quadratic residue (mod p) if and
only if $p \equiv 1 \pmod 4$.


We will give five proofs of this result (even though we don’t need more than
one!) to highlight how the various ideas in the book dovetail in this key result. It
is worth recalling that in exercise 7.4.3(c) we showed that if $p \equiv 1 \pmod 4$, then
$\left(\frac{p-1}{2}\right)!$ is a square root of -1 (mod p). We developed more efficient ways of finding
a square root of -1 (mod p) in section 7.21 of appendix 7C.
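
The recalled fact about $\left(\frac{p-1}{2}\right)!$ is easy to test numerically, as is the statement of Theorem 8.3 itself. The following Python sketch (function names are ours) does both for a few small primes; computing a factorial this way takes about p/2 multiplications, so it is a check, not an efficient method.

```python
from math import factorial


def sqrt_minus_one(p):
    """((p-1)/2)! mod p, which is a square root of -1 when the prime p ≡ 1 (mod 4)."""
    return factorial((p - 1) // 2) % p


def minus_one_is_square(p):
    """Exhaustive check of whether x^2 ≡ -1 (mod p) has a solution."""
    return any((x * x + 1) % p == 0 for x in range(p))


for p in (5, 13, 17, 29):
    x = sqrt_minus_one(p)
    print(p, x, (x * x) % p == p - 1)
```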


Proof #1. Euler’s criterion implies that $\left(\frac{-1}{p}\right) \equiv (-1)^{\frac{p-1}{2}} \pmod p$. Since each
side of the congruence is -1 or 1, and p, which is > 2, divides their difference, they
must be equal and so $\left(\frac{-1}{p}\right) = (-1)^{\frac{p-1}{2}}$, and the result follows.


Proof #2. In exercise 7.5.2 we saw that $-1 \equiv g^{(p-1)/2} \pmod p$ for any primitive
root g modulo p. Now if $-1 \equiv (g^k)^2 \pmod p$ for some integer k, then $\frac{p-1}{2} \equiv 2k \pmod{p-1}$,
and there exists such an integer k if and only if $\frac{p-1}{2}$ is even.


Proof #3. The number of quadratic non-residues (mod p) is $\frac{p-1}{2}$, and so, by
Wilson’s Theorem, we have

\[ \left(\frac{-1}{p}\right) = \left(\frac{(p-1)!}{p}\right) = \prod_{1 \le a \le p-1} \left(\frac{a}{p}\right) = (-1)^{\frac{p-1}{2}}. \]


Proof #4. If a is a quadratic residue, then so is 1/a (mod p). Therefore we may
“pair up” the quadratic residues (mod p), except those for which $a \equiv 1/a \pmod p$.
The only solutions to $a \equiv 1/a \pmod p$ (that is, $a^2 \equiv 1 \pmod p$) are $a \equiv 1$ and
$-1 \pmod p$. Therefore the product of the quadratic residues mod p is congruent
to $-\left(\frac{-1}{p}\right)$. On the other hand the roots of $x^{\frac{p-1}{2}} - 1 \pmod p$ are precisely the
quadratic residues mod p, and so, taking x = 0 in (8.2.1), the product of the
quadratic residues mod p is congruent to $(-1)(-1)^{\frac{p-1}{2}} \pmod p$. Comparing these
yields that $\left(\frac{-1}{p}\right) \equiv (-1)^{\frac{p-1}{2}} \pmod p$, and the result follows.


Proof #5. (Euler) The first part of Proof #4 implies that

\[ \frac{p-1}{2} = \#\{a \pmod p : a \text{ is a quadratic residue} \pmod p\} \]

has the same parity as

\[ \#\{a \in \{1,-1\} : a \text{ is a quadratic residue} \pmod p\} = \frac{1}{2}\left(3 + \left(\frac{-1}{p}\right)\right). \]

Multiplying through by 2 yields $p \equiv \left(\frac{-1}{p}\right) \pmod 4$, and the result follows.


Theorem 8.3 implies that if $p \equiv 1 \pmod 4$, then $\left(\frac{-a}{p}\right) = \left(\frac{a}{p}\right)$; and if $p \equiv -1 \pmod 4$,
then $\left(\frac{-a}{p}\right) = -\left(\frac{a}{p}\right)$.


Exercise 8.3.1. Let p be a prime $\equiv 3 \pmod 4$, which does not divide integer a. Prove that either
there exists x (mod p) for which $x^2 \equiv a \pmod p$ or there exists y (mod p) for which $y^2 \equiv -a \pmod p$,
but not both.


Exercise 8.3.2. (a) Prove that every prime factor p of $4n^2 + 1$ satisfies $p \equiv 1 \pmod 4$.
(b) Deduce that there are infinitely many primes $\equiv 1 \pmod 4$.


8.4. The residue 2


Calculations reveal that the odd primes p < 100 for which $\left(\frac{2}{p}\right) = 1$ are

p = 7, 17, 23, 31, 41, 47, 71, 73, 79, 89, and 97.

These are exactly the primes < 100 that are $\equiv \pm 1 \pmod 8$. This observation is
established as fact as follows:
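
The calculation behind this list can be repeated with Euler’s criterion. The Python sketch below uses a simple trial-division primality test and a helper of our own naming; it collects the odd primes below a bound for which a given integer a is a quadratic residue.

```python
def is_prime(n):
    """Trial-division primality test, adequate for small n."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def primes_with_residue(a, bound):
    """Odd primes p < bound, not dividing a, for which a is a quadratic residue mod p."""
    return [p for p in range(3, bound)
            if is_prime(p) and a % p != 0 and pow(a, (p - 1) // 2, p) == 1]


found = primes_with_residue(2, 100)
print(found)
print(all(p % 8 in (1, 7) for p in found))
```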


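Theorem 8.4. If p is an odd prime, then

\[ \left(\frac{2}{p}\right) = \begin{cases} 1 & \text{if } p \equiv 1 \text{ or } -1 \pmod 8, \\ -1 & \text{if } p \equiv 3 \text{ or } -3 \pmod 8. \end{cases} \]


Proof. We will evaluate the product

\[ S := \prod_{\substack{1 \le m \le p-1 \\ m \text{ even}}} m \pmod p \]

in two different ways. First note that each m in the product can be written as 2k
with $1 \le k \le \frac{p-1}{2}$, and so

\[ S = \prod_{k=1}^{\frac{p-1}{2}} (2k) = 2^{\frac{p-1}{2}} \left(\frac{p-1}{2}\right)!. \]

One can also rewrite each m in the product as p - n where n is odd; and if m is in
the range $\frac{p+1}{2} \le m \le p-1$, then $1 \le n \le \frac{p-1}{2}$. Therefore

\[ S = \prod_{\substack{1 \le m \le \frac{p-1}{2} \\ m \text{ even}}} m \cdot \prod_{\substack{1 \le n \le \frac{p-1}{2} \\ n \text{ odd}}} (p - n). \]

Let’s suppose there are r such values of n, and note that each $p - n \equiv -n \pmod p$.
Therefore

\[ S \equiv \prod_{\substack{1 \le m \le \frac{p-1}{2} \\ m \text{ even}}} m \cdot \prod_{\substack{1 \le n \le \frac{p-1}{2} \\ n \text{ odd}}} (-n) = (-1)^r \left(\frac{p-1}{2}\right)! \pmod p. \]

Comparing the two ways that we have evaluated S, and dividing through by $\left(\frac{p-1}{2}\right)!$,
we find that

\[ 2^{\frac{p-1}{2}} \equiv (-1)^r \pmod p. \]

The result follows from Euler’s criterion and verifying that r is even if $p \equiv \pm 1 \pmod 8$,
while r is odd if $p \equiv \pm 3 \pmod 8$ (see exercise 8.4.1).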


Exercise 8.4.1. For any odd integer q, let r denote the number of positive odd integers $\le \frac{q-1}{2}$.
Prove that r is even if $q \equiv \pm 1 \pmod 8$, while r is odd if $q \equiv \pm 3 \pmod 8$.


Gauss’s Lemma (Theorem 8.6 in appendix 8A) cleverly generalizes this proof
of Theorem 8.4 to classify the values of $\left(\frac{a}{p}\right)$ for any fixed integer a.

Calculations reveal that the odd primes p < 100 for which $\left(\frac{-2}{p}\right) = 1$ are

p = 3, 11, 17, 19, 41, 43, 59, 67, 73, 83, 89, and 97.

These are exactly the primes < 100 that are $\equiv 1$ or 3 (mod 8). This observation is
established as fact by combining Theorems 8.3 and 8.4 which allow us to evaluate
$\left(\frac{-2}{p}\right)$ by taking $\left(\frac{-2}{p}\right) = \left(\frac{-1}{p}\right)\left(\frac{2}{p}\right)$ for every odd prime p.


Exercise 8.4.2. Prove that if p is an odd prime, then

\[ \left(\frac{-2}{p}\right) = \begin{cases} 1 & \text{if } p \equiv 1 \text{ or } 3 \pmod 8, \\ -1 & \text{if } p \equiv 5 \text{ or } 7 \pmod 8. \end{cases} \]


Exercise 8.4.3. Prove that if 2 is a primitive root mod p, then $p \equiv 3$ or 5 (mod 8).


Exercise 8.4.4.† (a) Prove that if prime $p \mid M_n := 2^n - 1$ where n > 2 is prime, then $p \equiv 1 \pmod n$
and $p \equiv \pm 1 \pmod 8$.
(b) Prove that if p = 2n + 1 is prime, then $p \mid 2^n - 1$ if and only if $p \equiv \pm 1 \pmod 8$.
(c) Prove that if p = 2n + 1 is prime, then $p \mid 2^n + 1$ if and only if $p \equiv \pm 3 \pmod 8$.
(d) Prove that if q and p = 2q + 1 are both prime, then p divides $2^q - 1$ if and only if $q \equiv 3 \pmod 4$.
(e) Factor $2^{11} - 1 = 2047$.


Exercise 8.4.5.† In exercise 7.3.2 we proved that if prime p divides $2^{2^k} + 1$, then $p \equiv 1 \pmod{2^{k+1}}$.
Now show that $p \equiv 1 \pmod{2^{k+2}}$ if $k \ge 2$.²
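

8.5. The law of quadratic reciprocity


We have already seen that if p is an odd prime, then

\[ \left(\frac{-1}{p}\right) = \begin{cases} 1 & \text{if } p \equiv 1 \pmod 4, \\ -1 & \text{if } p \equiv -1 \pmod 4 \end{cases} \]

and

\[ \left(\frac{2}{p}\right) = \begin{cases} 1 & \text{if } p \equiv 1 \text{ or } -1 \pmod 8, \\ -1 & \text{if } p \equiv 3 \text{ or } -3 \pmod 8. \end{cases} \]

To be able to evaluate arbitrary Legendre symbols we will also need the law of
quadratic reciprocity.


Theorem 8.5 (The law of quadratic reciprocity). If p and q are given distinct
odd primes, then

\[ \left(\frac{p}{q}\right)\left(\frac{q}{p}\right) = \begin{cases} 1 & \text{if } p \equiv 1 \pmod 4 \text{ or } q \equiv 1 \pmod 4, \\ -1 & \text{if } p \equiv q \equiv -1 \pmod 4. \end{cases} \]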


These rules, taken together, allow us to rapidly evaluate any Legendre symbol.
For example, to evaluate $(m/p)$, we first reduce $m$ mod $p$, so that $(m/p) = (n/p)$
where $n \equiv m \pmod p$ and $|n| < p$. Next we factor $n$ and, by the multiplicativity of
the Legendre symbol, we can evaluate $(n/p)$ in terms of $(-1/p)$, $(2/p)$ and the $(q/p)$
for those primes $q$ dividing $n$. We can easily determine the values of $(-1/p)$ and
$(2/p)$ from determining $p \pmod 8$, and then we need to evaluate each $(q/p)$ where
$q \le |n| < p$. We do this by the law of quadratic reciprocity since $(q/p) = \pm(p/q)$
depending only on the values of $p$ and $q$ mod 4.³ We repeat the procedure on each
$(p/q)$. Clearly this process will quickly finish as the numbers involved are always
getting smaller. Let us work through some examples.


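\[
\begin{aligned}
\left(\frac{111}{71}\right) &= \left(\frac{40}{71}\right) = \left(\frac{2}{71}\right)^{3}\left(\frac{5}{71}\right) && \text{as } 111 \equiv 40 \pmod{71} \text{ and } 40 = 2^3\cdot 5, \\
&= 1^3 \cdot 1 \cdot \left(\frac{71}{5}\right) && \text{as } 71 \equiv -1 \pmod 8 \text{ and } 5 \equiv 1 \pmod 4, \\
&= \left(\frac{1}{5}\right) = 1 && \text{as } 71 \equiv 1 \pmod 5.
\end{aligned}
\]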


2 We can use this to “demystify” Euler’s factorization of $F_5$: Exercise 8.4.5 implies that any prime
factor $p$ of $F_5$ must be of the form $128m+1$. This is divisible by 3, 5, and 3 for $m = 1$, 3, and 4,
respectively, so is not prime. If $m = 2$, then $p = F_3$ which we proved, in an earlier section, is coprime with $F_5$.
Finally, if $m = 5$, then $p = 641$ is a prime factor of $F_5$.

3 Note that if $\left(\frac{p}{q}\right)\left(\frac{q}{p}\right) = \eta$ $(= \pm 1)$ by the law of quadratic reciprocity, then $\left(\frac{q}{p}\right) = \eta\left(\frac{p}{q}\right)$.

There is more than one way to proceed with these rules:


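\[
\begin{aligned}
\left(\frac{111}{71}\right) &= \left(\frac{-1}{71}\right)\left(\frac{31}{71}\right) && \text{as } 111 \equiv -31 \pmod{71}, \\
&= (-1)\cdot(-1)\cdot\left(\frac{71}{31}\right) && \text{as } 71 \equiv 31 \equiv -1 \pmod 4, \\
&= \left(\frac{9}{31}\right) = \left(\frac{3}{31}\right)^{2} = 1 && \text{as } 71 \equiv 9 = 3^2 \pmod{31}.
\end{aligned}
\]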


A slightly larger example is 
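\[
\begin{aligned}
\left(\frac{869}{311}\right) &= \left(\frac{247}{311}\right) = \left(\frac{13}{311}\right)\left(\frac{19}{311}\right) = 1\cdot\left(\frac{311}{13}\right)\cdot(-1)\left(\frac{311}{19}\right) \\
&= -\left(\frac{-1}{13}\right)\left(\frac{7}{19}\right) = -1\cdot 1\cdot(-1)\left(\frac{19}{7}\right) = \left(\frac{-2}{7}\right) = -1.
\end{aligned}
\]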


Although longer, each step is straightforward except when we factored 247 = 13 × 19
(a factorization which is not obvious for most of us, and imagine how difficult
factoring might be when we are dealing with much larger numbers). Indeed, this
is an efficient procedure provided that one is capable of factoring the numbers n
that arise. Although this may be the case for small examples, it is not practical for
large examples. We can bypass this potential difficulty by using the Jacobi symbol,
a generalization of the Legendre symbol, which we will discuss in section 8.7.
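
The reduce, factor, and flip procedure described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function names `legendre` and `euler_legendre` are our own, the factoring is done by naive trial division (the step the text warns is impractical for large numbers), and Euler's criterion is used purely as an independent check.

```python
def euler_legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p, via Euler's criterion."""
    t = pow(a, (p - 1) // 2, p)          # t is 0, 1 or p-1
    return -1 if t == p - 1 else t


def legendre(m, p):
    """Legendre symbol (m/p), p an odd prime, by reducing, factoring and flipping."""
    n = m % p
    if n > p // 2:
        n -= p                           # least residue in absolute value
    if n == 0:
        return 0
    result = 1
    if n < 0:                            # (-1/p) depends on p mod 4
        n = -n
        if p % 4 == 3:
            result = -result
    while n % 2 == 0:                    # (2/p) depends on p mod 8
        n //= 2
        if p % 8 in (3, 5):
            result = -result
    q = 3
    while n > 1:                         # trial division: the costly step
        if q * q > n:
            q = n                        # what remains is prime
        if n % q == 0:
            n //= q
            # quadratic reciprocity: (q/p) = -(p/q) only if p = q = 3 (mod 4)
            sign = -1 if (p % 4 == 3 and q % 4 == 3) else 1
            result *= sign * legendre(p, q)
        else:
            q += 2
    return result


print(legendre(111, 71), legendre(869, 311))   # the two worked examples
```

The two printed values should agree with the worked examples, namely 1 and -1, and `euler_legendre` gives the same answers without any factoring.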


In the next subsection we will prove the law of quadratic reciprocity, justifying 
the algorithm used above to determine the value of any given Legendre symbol. 

The law of quadratic reciprocity is easily used to determine various other rules. 
For example, when is 3 a square mod p? This is the same as asking when (3/p) = 1. 
Now by quadratic reciprocity we have two cases: 

• If $p \equiv 1 \pmod 4$, then $(3/p) = (p/3)$, and $(p/3) = 1$ when $p \equiv 1 \pmod 3$, so
we have $(3/p) = 1$ when $p \equiv 1 \pmod{12}$ (using the Chinese Remainder Theorem).

• If $p \equiv -1 \pmod 4$, then $(3/p) = -(p/3)$, and $(p/3) = -1$ when $p \equiv -1 \pmod 3$,
so we have $(3/p) = 1$ when $p \equiv -1 \pmod{12}$ (using the Chinese
Remainder Theorem).


We have therefore proved that $(3/p) = 1$ if and only if $p \equiv 1$ or $-1 \pmod{12}$.
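
As a sanity check on this rule, one can compare it with a brute-force search for square roots of 3. The following is a minimal sketch; the helper names `is_prime` and `is_square_mod` and the bound 500 are arbitrary choices for illustration.

```python
def is_prime(n):
    """Naive primality test, adequate for small n."""
    return n > 1 and all(n % d for d in range(2, int(n ** 0.5) + 1))


def is_square_mod(a, p):
    """Brute force: does x^2 = a (mod p) have a solution?"""
    return any((x * x - a) % p == 0 for x in range(p))


primes = [p for p in range(5, 500) if is_prime(p)]
# 3 should be a square mod p exactly when p = 1 or -1 (mod 12)
agree = all(is_square_mod(3, p) == (p % 12 in (1, 11)) for p in primes)
print(agree)
```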


Exercise 8.5.1. Determine (a) (32); (b) (323); (c) (32); (d) (33); (e) =). 


Exercise 8.5.2. (a) Show that if prime $p \equiv 1 \pmod 5$, then 5 is a quadratic residue mod $p$.
(b) Show that if prime $p \equiv 3 \pmod 5$, then 5 is a quadratic non-residue mod $p$.
(c) Determine all odd primes $p$ for which $(5/p) = -1$.


Exercise 8.5.3. Prove that if $p := 2^n - 1$ is prime with $n > 2$, then $(3/p) = -1$.


Exercise 8.5.4.† Suppose that $F_m = 2^{2^m}+1$ with $m \ge 2$ is prime. Prove that
\[
3^{\frac{F_m-1}{2}} \equiv 5^{\frac{F_m-1}{2}} \equiv -1 \pmod{F_m}.
\]


Exercise 8.5.5.† (a) Determine all odd primes $p$ for which $(7/p) = 1$.
(b) Find all primes $p$ such that there exists $x \pmod p$ for which $2x^2 - 2x - 3 \equiv 0 \pmod p$.

Exercise 8.5.6. Show that if $p$ and $q = p+2$ are “twin primes”, then $p$ is a quadratic residue
mod $q$ if and only if $q$ is a quadratic residue mod $p$.


Exercise 8.5.7. Prove that $(-3/p) = (p/3)$ for all primes $p$.


8.6. Proof of the law of quadratic reciprocity 


Suppose that $p < q$ are odd primes, and let $n = pq$. Given residue classes $a$
(mod $p$) and $b$ (mod $q$) there exists a unique residue class $r$ (mod $n$) for which
$r \equiv a \pmod p$ and $r \equiv b \pmod q$, by the Chinese Remainder Theorem. Let $r(a,b)$
be the least residue of $r$ mod $n$ in absolute value and let $m(a,b) = |r(a,b)|$, so that
$1 \le m(a,b) < n/2$, and $m(a,b) = r(a,b)$ or $-r(a,b)$. We claim that
\[
\left\{ m(a,b) : 1 \le a \le p-1 \text{ and } 1 \le b \le \tfrac{q-1}{2} \right\} = \left\{ m : 1 \le m < \tfrac{n}{2} \text{ with } (m,n) = 1 \right\},
\]
since the two sets both have $\phi(n)/2$ elements, each such $m(a,b) \in [1, \frac{n}{2})$ with
$(m,n) = 1$, and the $m(a,b)$ are distinct. This last assertion holds or else if $m(a,b) = m(a',b')$,
then $r(a,b) \equiv \pm r(a',b') \pmod n$, so that $b \equiv \pm b' \pmod q$. As $1 \le b, b' \le \frac{q-1}{2}$
this implies that $b = b'$ so that the sign is “+”, and therefore $a \equiv a' \pmod p$
implying that $a = a'$.


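Since each $m(a,b) = \pm r(a,b)$, we deduce that there exists $\sigma = -1$ or 1 such
that
\[
\sigma \prod_{\substack{1 \le a \le p-1 \\ 1 \le b \le \frac{q-1}{2}}} r(a,b) = \prod_{\substack{1 \le a \le p-1 \\ 1 \le b \le \frac{q-1}{2}}} m(a,b) = \prod_{\substack{1 \le m < n/2 \\ (m,n)=1}} m. \tag{8.6.1}
\]

We will calculate the two sides in this identity, mod $p$ and mod $q$, and compare.
As $r(a,b) \equiv a \pmod p$ the product on the left-hand side of (8.6.1) is
\[
\prod_{\substack{1 \le a \le p-1 \\ 1 \le b \le \frac{q-1}{2}}} r(a,b) \equiv \prod_{1 \le b \le \frac{q-1}{2}} \ \prod_{1 \le a \le p-1} a = \big((p-1)!\big)^{\frac{q-1}{2}} \equiv (-1)^{\frac{q-1}{2}} \pmod p,
\]
using Wilson’s Theorem. We rewrite the right-hand side of (8.6.1), multiplying top
and bottom by the integers $m \in [1, \frac{n}{2})$ that are divisible by $q$, to obtain
\[
\prod_{\substack{1 \le m < n/2 \\ (m,p)=1}} m \Bigg/ \prod_{\substack{1 \le m < n/2 \\ q \mid m}} m.
\]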

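We partition the $m$’s in the numerator into intervals of length $p$, because
\[
\prod_{\substack{ip < m < (i+1)p \\ (m,p)=1}} m = \prod_{j=1}^{p-1} (ip+j) \equiv \prod_{j=1}^{p-1} j = (p-1)! \equiv -1 \pmod p,
\]
by Wilson’s Theorem. Applying this for $0 \le i \le \frac{q-3}{2}$ we get a contribution of
$(-1)^{\frac{q-1}{2}}$ to the numerator. The remaining integers in the numerator contribute
\[
\prod_{\frac{q-1}{2}p < m < n/2} m = \prod_{j=1}^{(p-1)/2} \left(\frac{q-1}{2}\,p + j\right) \equiv \prod_{j=1}^{(p-1)/2} j = \left(\frac{p-1}{2}\right)! \pmod p.
\]
On the other hand the $m$’s in the denominator can be written as $qk$ with $1 \le k \le \frac{p-1}{2}$ and so
\[
\prod_{\substack{1 \le m < n/2 \\ q \mid m}} m = \prod_{1 \le k \le \frac{p-1}{2}} qk = q^{\frac{p-1}{2}} \left(\frac{p-1}{2}\right)! \equiv \left(\frac{q}{p}\right)\left(\frac{p-1}{2}\right)! \pmod p,
\]
by Euler’s criterion. Cancelling the $\left(\frac{p-1}{2}\right)!$ from the numerator and denominator, we
deduce that the right-hand side of (8.6.1) is $\equiv (-1)^{\frac{q-1}{2}} \left(\frac{q}{p}\right) \pmod p$. Comparing
our calculation of the left- and right-hand sides of (8.6.1) mod $p$, we obtain
\[
\sigma(-1)^{\frac{q-1}{2}} \equiv (-1)^{\frac{q-1}{2}} \left(\frac{q}{p}\right) \pmod p. \tag{8.6.2}
\]
Since both sides are 1 or $-1$ and are congruent mod $p$, they must be equal and so
we deduce that
\[
\sigma = \left(\frac{q}{p}\right).
\]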


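Next we reduce (8.6.1) mod $q$. For the right-hand side we proceed entirely
analogously to how we did mod $p$, with the roles of $p$ and $q$ reversed and so obtain
\[
(-1)^{\frac{p-1}{2}} \left(\frac{p}{q}\right) \pmod q.
\]

For the left-hand side of (8.6.1) mod $q$, we note that each $r(a,b) \equiv b \pmod q$,
so that
\[
\prod_{\substack{1 \le a \le p-1 \\ 1 \le b \le \frac{q-1}{2}}} r(a,b) \equiv \prod_{1 \le a \le p-1} \ \prod_{1 \le b \le \frac{q-1}{2}} b = \left(\left(\frac{q-1}{2}\right)!\right)^{p-1} \pmod q.
\]

In exercise 7.4.3 we saw⁴ that $(q-1)! \equiv (-1)^{\frac{q-1}{2}} \left(\left(\frac{q-1}{2}\right)!\right)^2 \pmod q$, and therefore
\[
\left(\left(\frac{q-1}{2}\right)!\right)^2 \equiv (-1)^{\frac{q+1}{2}} \pmod q
\]
by Wilson’s Theorem. Therefore
\[
\prod_{\substack{1 \le a \le p-1 \\ 1 \le b \le \frac{q-1}{2}}} r(a,b) \equiv \left(\left(\left(\frac{q-1}{2}\right)!\right)^2\right)^{\frac{p-1}{2}} \equiv (-1)^{\frac{p-1}{2}\cdot\frac{q+1}{2}} \pmod q.
\]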

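Substituting this and the above into (8.6.1), together with $\sigma = \left(\frac{q}{p}\right)$, we obtain
\[
\left(\frac{q}{p}\right)(-1)^{\frac{p-1}{2}\cdot\frac{q+1}{2}} \equiv (-1)^{\frac{p-1}{2}} \left(\frac{p}{q}\right) \pmod q. \tag{8.6.3}
\]

Again both sides are 1 or $-1$ and are congruent mod $q$, so must be equal. Multiplying
both sides through by $(-1)^{\frac{p-1}{2}} \left(\frac{q}{p}\right)$ implies that
\[
\left(\frac{p}{q}\right)\left(\frac{q}{p}\right) = (-1)^{\frac{p-1}{2}\cdot\frac{q-1}{2}}.
\]

4 See the solution to exercise 7.4.3 at the end of the book for a proof.

From here we work through the four cases for $p$ and $q$ mod 4 and deduce the law
of quadratic reciprocity (Theorem 8.5).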
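
For small primes the two key facts in this proof can be checked directly: that the values $m(a,b)$ run over exactly the reduced residues below $n/2$, and that the sign $\sigma$ in (8.6.1) equals $(q/p)$. The sketch below is only a numerical illustration; the function name `check_proof_ingredients` is our own, and the Chinese Remainder Theorem is applied by a naive search rather than by a formula.

```python
from math import gcd


def check_proof_ingredients(p, q):
    """Return (sets_agree, sigma) for distinct odd primes p < q, following section 8.6."""
    n = p * q
    sigma = 1
    ms = set()
    for a in range(1, p):
        for b in range(1, (q - 1) // 2 + 1):
            # Chinese Remainder Theorem by search: r = a (mod p), r = b (mod q)
            r = next(r for r in range(b, n, q) if r % p == a)
            if r > n // 2:
                r -= n               # least residue in absolute value
            if r < 0:
                sigma = -sigma       # sigma collects the signs of the r(a, b)
            ms.add(abs(r))
    reduced = {m for m in range(1, n // 2 + 1) if gcd(m, n) == 1}
    return ms == reduced, sigma


for p, q in [(3, 5), (3, 7), (5, 7), (7, 11)]:
    euler = pow(q, (p - 1) // 2, p)          # Euler's criterion for (q/p)
    legendre_qp = -1 if euler == p - 1 else euler
    print(p, q, check_proof_ingredients(p, q), legendre_qp)
```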


There are many proofs of the law of quadratic reciprocity, 246 at the last
count (see the list at http://www.rzuser.uni-heidelberg.de/~hb3/fchrono.html).
In this chapter’s appendices we present two of the best: the original proof
due to Gauss and an elegant proof due to Eisenstein. We also discuss two other 
proofs in the exercises and then two sophisticated but shorter proofs in chapter 14. 


8.7. The Jacobi symbol 


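The Jacobi symbol is defined as follows: If $m$ is a positive odd integer, we write
$m = \prod_p p^{e_p}$, where the $p$ are distinct odd primes, and then
\[
\left(\frac{a}{m}\right) = \prod_p \left(\frac{a}{p}\right)^{e_p}.
\]
This is defined only for odd $m$, not for even $m$.

If $a$ is a square modulo $m$, then, by the Chinese Remainder Theorem, $a$ is a
square modulo every prime $p$ dividing $m$; that is, $(a/p) = 0$ or 1 for all $p \mid m$ and so
$(a/m) = 0$ or 1. However the converse is not always true; for example, 2 is not a
square mod 15 as
\[
\left(\frac{2}{3}\right) = \left(\frac{2}{5}\right) = -1, \quad\text{but}\quad \left(\frac{2}{15}\right) = \left(\frac{2}{3}\right)\left(\frac{2}{5}\right) = 1.
\]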
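
The example with 15 is easy to reproduce by computing the Jacobi symbol straight from its definition and listing the squares mod 15. This is a small sketch under our own naming (`legendre`, `jacobi_by_definition`, `squares_mod`); it factors $m$ by trial division, which is fine for tiny moduli only.

```python
def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p, via Euler's criterion."""
    t = pow(a, (p - 1) // 2, p)
    return -1 if t == p - 1 else t


def jacobi_by_definition(a, m):
    """(a/m) for odd m > 0 as a product of Legendre symbols over the primes dividing m."""
    result, p = 1, 3
    while m > 1:
        if p * p > m:
            p = m                    # what remains of m is prime
        while m % p == 0:
            m //= p
            result *= legendre(a, p)
        p += 2
    return result


def squares_mod(m):
    """The set of squares mod m."""
    return {x * x % m for x in range(m)}


print(jacobi_by_definition(2, 15))   # the symbol is 1 ...
print(2 in squares_mod(15))          # ... although 2 is not a square mod 15
```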


Exercise 8.7.1. Suppose that $m$ is an odd positive integer.
(a) Prove that $\left(\frac{a}{m}\right) = \left(\frac{b}{m}\right)$ whenever $a \equiv b \pmod m$.
(b) Prove that $\left(\frac{ab}{m}\right) = \left(\frac{a}{m}\right)\left(\frac{b}{m}\right)$.
(c) Prove that if $\left(\frac{a}{m}\right) = -1$, then $a$ is not a square mod $m$.
(d) Prove that $\left(\frac{a}{m}\right) = 0$ if and only if $(a,m) > 1$.


Exercise 8.7.2. (a) Prove that $\sum_{a=1}^{m} \left(\frac{a}{m}\right) = 0$ for every non-square odd integer $m > 2$.
(b) For how many residues $a$ mod $m$ do we have $(a/m) = 1$?
(c) For how many residues $a$ mod $m$ do we have $(a/m) = -1$?


Exercise 8.7.3. Show that if n > 1, then (as) =1. 


Theorems 8.3, 8.4, and 8.5 can all be extended to the Jacobi symbol (as we will
prove at the end of this section): If $m$ and $n$ are odd, coprime integers > 1, then
\[
\left(\frac{-1}{n}\right) = \begin{cases} 1 & \text{if } n \equiv 1 \pmod 4, \\ -1 & \text{if } n \equiv -1 \pmod 4, \end{cases} \tag{8.7.1}
\]
\[
\left(\frac{2}{n}\right) = \begin{cases} 1 & \text{if } n \equiv 1 \text{ or } -1 \pmod 8, \\ -1 & \text{if } n \equiv 3 \text{ or } -3 \pmod 8, \end{cases} \tag{8.7.2}
\]
and the law of quadratic reciprocity
\[
\left(\frac{m}{n}\right)\left(\frac{n}{m}\right) = (-1)^{\frac{m-1}{2}\cdot\frac{n-1}{2}}. \tag{8.7.3}
\]

We can use these three rules to easily evaluate $(m/n)$ for any odd coprime
integers $m$ and $n$. One begins by selecting $M \equiv m \pmod n$ as conveniently as
possible, usually with $|M| < n$. Then we factor $M = \pm 2^k \ell$ where $\ell$ is an odd
positive integer $< n$, so that
\[
\left(\frac{m}{n}\right) = \left(\frac{M}{n}\right) = \left(\frac{\pm 1}{n}\right)\left(\frac{2}{n}\right)^{k}\left(\frac{\ell}{n}\right).
\]
We can evaluate the first two Jacobi symbols using the first two rules above (which depend only on the
value of $n \pmod 8$), and then we know that $\left(\frac{\ell}{n}\right) = \pm\left(\frac{n}{\ell}\right)$ by the third rule. To
evaluate $\left(\frac{n}{\ell}\right)$ we repeat this process, but now with a smaller pair of numbers, so
that the algorithm will terminate after finitely many steps.


This algorithm only involves dividing out powers of 2 and a possible minus sign,
so it goes fast and avoids serious factoring; in fact it is guaranteed to go at least as
fast as the Euclidean algorithm since it involves very similar steps.⁵ Here is a first
straightforward example using the Jacobi symbol, instead of the Legendre symbol:
\[
\left(\frac{106}{71}\right) = \left(\frac{35}{71}\right) = -\left(\frac{71}{35}\right) = -\left(\frac{1}{35}\right) = -1.
\]
(Note that (71/35) is not the Legendre symbol as 35 is not prime, but it is a Jacobi
symbol.) Now let’s revisit the example $\left(\frac{869}{311}\right)$ from section 8.5 and avoid factoring
247:
\[
\left(\frac{869}{311}\right) = \left(\frac{247}{311}\right) = -\left(\frac{311}{247}\right) = -\left(\frac{64}{247}\right) = -1.
\]


We did not need to factor 247, and each step of the algorithm was straightforward. 
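
The algorithm translates almost line by line into code. The sketch below is one common way to organize it, not necessarily the arrangement the text has in mind: it always reduces to a non-negative residue (so rule (8.7.1) is never needed explicitly), strips powers of 2 using (8.7.2), and flips using (8.7.3). The function name `jacobi` is our own.

```python
def jacobi(m, n):
    """Jacobi symbol (m/n) for odd n > 0, with no factoring beyond powers of 2."""
    if n <= 0 or n % 2 == 0:
        raise ValueError("n must be a positive odd integer")
    m %= n
    result = 1
    while m != 0:
        while m % 2 == 0:            # rule (8.7.2): (2/n) = -1 iff n = 3 or 5 (mod 8)
            m //= 2
            if n % 8 in (3, 5):
                result = -result
        # m is now odd with 0 < m < n; rule (8.7.3) flips the symbol
        if m % 4 == 3 and n % 4 == 3:
            result = -result
        m, n = n % m, m              # reduce the new numerator mod the new denominator
    return result if n == 1 else 0   # a common factor > 1 gives the value 0


print(jacobi(106, 71), jacobi(869, 311))   # the two examples above
```

Each pass through the outer loop mirrors one step of the Euclidean algorithm, which is why the running time is comparable.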


Exercise 8.7.4. Determine (a) (3); (b) (333); (c) (333); (d) (54%). 


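Proof of (8.7.1), (8.7.2), and (8.7.3). We proceed by induction on the number of
prime factors of $m$ and $n$. The results follow when $m$ and $n$ have one prime factor
by Theorems 8.3, 8.4, and 8.5, respectively. Otherwise we write $n = ap$ for some
prime $p$ dividing $n$ (swapping the roles of $m$ and $n$ if necessary).


Exercise 8.7.5. Prove that $\frac{a-1}{2} + \frac{b-1}{2} \equiv \frac{ab-1}{2} \pmod 2$ for any odd integers $a$, $b$.


Equation (8.7.1) can be rephrased as $\left(\frac{-1}{n}\right) = (-1)^{\frac{n-1}{2}}$. By induction, using the
multiplicativity of the denominator of the Jacobi symbol,
\[
\left(\frac{-1}{n}\right) = \left(\frac{-1}{a}\right)\left(\frac{-1}{p}\right) = (-1)^{\frac{a-1}{2}} \cdot (-1)^{\frac{p-1}{2}} = (-1)^{\frac{ap-1}{2}} = (-1)^{\frac{n-1}{2}},
\]
by exercise 8.7.5.


Similarly by induction and multiplicativity of the numerator and denominator,
\[
\begin{aligned}
\left(\frac{m}{n}\right)\left(\frac{n}{m}\right) &= \left(\frac{m}{a}\right)\left(\frac{m}{p}\right) \cdot \left(\frac{a}{m}\right)\left(\frac{p}{m}\right) = \left(\frac{m}{a}\right)\left(\frac{a}{m}\right) \cdot \left(\frac{m}{p}\right)\left(\frac{p}{m}\right) \\
&= (-1)^{\frac{a-1}{2}\cdot\frac{m-1}{2}} \cdot (-1)^{\frac{p-1}{2}\cdot\frac{m-1}{2}} = (-1)^{\left(\frac{a-1}{2}+\frac{p-1}{2}\right)\cdot\frac{m-1}{2}} = (-1)^{\frac{n-1}{2}\cdot\frac{m-1}{2}},
\end{aligned}
\]
by exercise 8.7.5.


5 As in the “speeded up” version of the Euclidean algorithm, given in appendix 1B.

If $\left(\frac{2}{a}\right) = \left(\frac{2}{p}\right)$, then $a \equiv \pm p \pmod 8$, so that $n = ap \equiv \pm 1 \pmod 8$, and
therefore $\left(\frac{2}{n}\right) = \left(\frac{2}{a}\right)\left(\frac{2}{p}\right) = (\pm 1)^2 = 1$. If $\left(\frac{2}{a}\right) = -\left(\frac{2}{p}\right)$, then $a \equiv \pm 3p \pmod 8$, so
that $n = ap \equiv \pm 3 \pmod 8$, and therefore $\left(\frac{2}{n}\right) = \left(\frac{2}{a}\right)\left(\frac{2}{p}\right) = (1)(-1) = -1$.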


Gauss gave a different proof of (8.7.2), tying the question directly into finding 
solutions to quadratic equations. This foreshadows Gauss’s proof of the full law of 
quadratic reciprocity, which we will give in appendix 8C. 


Gauss’s induction step for integers $n \equiv \pm 3 \pmod 8$. We suppose that (8.7.2)
is true for all odd integers $m < n$ and that $n \equiv \pm 3 \pmod 8$. If $n = ab$ is composite
with $1 < a, b < n$, then $\left(\frac{2}{n}\right) = \left(\frac{2}{a}\right)\left(\frac{2}{b}\right)$ and the result for $n$ follows by applying the
induction hypothesis with $m = a$ and with $m = b$.


Therefore we may suppose that $n = p$ is prime and assume that $\left(\frac{2}{p}\right) = 1$. Let $a$
be the smallest odd positive integer for which $a^2 \equiv 2 \pmod p$ so that $1 \le a \le p-1$
(for if $b$ is the smallest positive integer for which $b^2 \equiv 2 \pmod p$, then let $a = b$ if $b$
is odd, and $a = p - b$ if $b$ is even), and write $a^2 - 2 = pr$. Evidently $pr = a^2 - 2 \equiv -1 \pmod 8$
and so $r \equiv p^2 r = p(pr) \equiv -p \equiv \pm 3 \pmod 8$. Now $a^2 \equiv 2 \pmod r$ and so
$\left(\frac{2}{r}\right) = 1$ with $r = \frac{a^2-2}{p} < p$ and $r \equiv \pm 3 \pmod 8$. This contradicts the induction
hypothesis, and so our assumption is wrong. Therefore $\left(\frac{2}{p}\right) = -1$.


Exercise 8.7.6. Prove an analogous induction step for integers $n \equiv 5$ or $7 \pmod 8$ when establishing
the value of $\left(\frac{-2}{n}\right)$.


Exercise 8.7.7 (A useful reformulation of the law of quadratic reciprocity). For a given odd,
squarefree integer $n > 1$ let $n^* = \left(\frac{-1}{n}\right) n$. Prove that $n^* \equiv 1 \pmod 4$ and that we have
$\left(\frac{m}{n}\right) = \left(\frac{n^*}{m}\right)$ for all odd integers $m > 1$.


8.8. The squares modulo m 


To determine the squares mod $m$, that is, the residues $a$ (mod $m$) for which there
exists $b$ (mod $m$) with $b^2 \equiv a \pmod m$, we may use the Chinese Remainder Theorem:
We know that $a$ is a square mod $m$ if and only if $a$ is a square modulo every
prime power factor of $m$. So it is sufficient to understand the squares modulo every
prime power.

Above we have understood the squares modulo every prime $p$. We now “lift”
these squares to determine the squares modulo every prime power, $p^k$. Let’s begin
by studying the squares mod $p^2$:

The squares mod 9 are 0, 1, 4, and 7 mod 9 (these are the least residues of
$0^2, 1^2, \ldots, 8^2 \pmod 9$, excluding repetitions). The non-zero residues, 1, 4, and 7
are all $\equiv 1 \pmod 3$; in fact they are all of the residue classes $a$ (mod 9) for which
$a \equiv 1 \pmod 3$. We have seen that 1 (mod 3) is the only quadratic residue mod 3.


Similarly mod 25 we have the squares

0, 1, 4, 9, 16, 11, 24, 14, 6, 21, and 19 (mod 25).

The non-zero squares here are 1, 6, 11, 16, and 21 (mod 25), the residue classes $a$
(mod 25) for which $a \equiv 1 \pmod 5$, and 4, 9, 14, 19, and 24 (mod 25), the residue
classes $a$ (mod 25) for which $a \equiv 4 \pmod 5$. Moreover 1 and 4 (mod 5) are the
quadratic residues mod 5.


A pattern begins to emerge. Define a to be a quadratic residue (mod m) if 
(a,m) = 1 and there exists b (mod m) for which b? = a (mod m). 


Proposition 8.8.1. Let $p$ be a prime. If $r$ is a quadratic residue mod $p^k$, then $r$
is a quadratic residue mod $p^{k+1}$ whenever $k \ge 1$, except perhaps when $p^k = 2$ or 4.


Proof. There exists an integer $x$ for which $x^2 \equiv r \pmod{p^k}$, and $(x,p) = 1$ as
$(r,p) = 1$. We let $n$ be that integer for which $x^2 = r + np^k$.


Now if $p$ is odd, then, for any integer $j$, we have
\[
(x - jp^k)^2 = x^2 - 2jxp^k + j^2p^{2k} \equiv r + (n - 2jx)p^k \pmod{p^{k+1}}.
\]
This is $\equiv r \pmod{p^{k+1}}$ if and only if $2jx \equiv n \pmod p$, which holds if and only
if $j \equiv n/2x \pmod p$ (as $(2x,p) = 1$). Therefore $r$ is a square mod $p^{k+1}$, and our
proof yields that there is a unique $X$ (mod $p^{k+1}$) for which $X \equiv x \pmod{p^k}$ and
$X^2 \equiv r \pmod{p^{k+1}}$, namely $X \equiv x - jp^k \pmod{p^{k+1}}$ where $j \equiv n/2x \pmod p$.

If $p = 2$, then $x^2 = r + n \cdot 2^k$ and $x$ is odd so that $x^2 - nx2^k \equiv r \pmod{2^{k+1}}$.
Therefore
\[
(x - n2^{k-1})^2 = x^2 - nx2^k + n^2 2^{2k-2} \equiv r \pmod{2^{k+1}},
\]
provided the exponent $2k - 2 \ge k+1$; that is, $k \ge 3$.
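
For odd $p$ the proof is constructive, and the lifting step can be sketched as follows. The function name `lift_square_root` is our own, and the modular inverse is computed with Python's three-argument `pow` with exponent -1, which assumes Python 3.8 or later.

```python
def lift_square_root(x, r, p, k):
    """Given x^2 = r (mod p^k) with p an odd prime not dividing r,
    return the X = x (mod p^k) with X^2 = r (mod p^(k+1))."""
    pk = p ** k
    n = (x * x - r) // pk              # x^2 = r + n * p^k
    j = n * pow(2 * x, -1, p) % p      # j = n / (2x) (mod p)
    return (x - j * pk) % (pk * p)


# Example: 3^2 = 2 (mod 7); lift to a square root of 2 mod 49, then mod 343.
x1 = 3
x2 = lift_square_root(x1, 2, 7, 1)
x3 = lift_square_root(x2, 2, 7, 2)
print(x2, x2 * x2 % 49)     # a square root of 2 mod 49 and its square
print(x3, x3 * x3 % 343)    # a square root of 2 mod 343 and its square
```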


Exercise 8.8.1. Deduce that an integer $r$ is a quadratic residue mod $p^k$ if and only if $r$ is a
quadratic residue mod $p$, when $p$ is odd, and if and only if $r \equiv 1 \pmod{\gcd(2^k, 8)}$ when $p = 2$.


This implies that exactly half of the reduced residue classes mod $p^k$ are quadratic
residues, when $p$ is odd, and exactly one quarter when $p = 2$ and $k \ge 3$.


Using the Chinese Remainder Theorem we therefore deduce from exercise 8.8.1
the following:
Corollary 8.8.1. Suppose that $(a,m) = 1$. Then $a$ is a square mod $m$ if and only
if $\left(\frac{a}{p}\right) = 1$ for every odd prime $p$ dividing $m$, and $a \equiv 1 \pmod{\gcd(m,8)}$.
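
The criterion in Corollary 8.8.1 can be compared against brute force for small moduli. This is a minimal sketch; the helper names (`legendre`, `odd_prime_factors`, `is_square_by_corollary`, `is_square_brute`) are hypothetical, and the factoring is again naive trial division.

```python
from math import gcd


def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p, via Euler's criterion."""
    t = pow(a, (p - 1) // 2, p)
    return -1 if t == p - 1 else t


def odd_prime_factors(m):
    """The distinct odd primes dividing m, by trial division."""
    while m % 2 == 0:
        m //= 2
    factors, p = [], 3
    while m > 1:
        if p * p > m:
            p = m                    # what remains is prime
        if m % p == 0:
            factors.append(p)
            while m % p == 0:
                m //= p
        p += 2
    return factors


def is_square_by_corollary(a, m):
    """Corollary 8.8.1, for (a, m) = 1."""
    if (a - 1) % gcd(m, 8) != 0:
        return False
    return all(legendre(a, p) == 1 for p in odd_prime_factors(m))


def is_square_brute(a, m):
    return any((x * x - a) % m == 0 for x in range(m))


print(all(is_square_by_corollary(a, m) == is_square_brute(a, m)
          for m in range(2, 120) for a in range(1, m) if gcd(a, m) == 1))
```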
Exercise 8.8.2. Suppose that (a,n) = 1 and that b? =a (mod n). Prove that the set of solutions 


x (mod n) to 2? = a (mod n) is given by the values br (mod n) as r runs through the solutions 
to r? =1 (mod n). (Determining the square roots of 1 (mod n) is discussed in section B.8]) 


Additional exercises 


Exercise 8.9.1. Let $p$ be an odd prime where $p \nmid a$. Show that the congruence $ax^2 + bx + c \equiv 0 \pmod p$
has a solution $x$ (mod $p$) if and only if $b^2 - 4ac$ is a square mod $p$.


Exercise 8.9.2.† Prove that $m^2$ and $m^2+1$ are both squares mod $p$, for $m$ equal to at least one
of $a$, $a+1$, or $a^2+a+1$, for any integer $a$. (This generalizes part (a) of an earlier exercise.)


Exercise 8.9.3. The polynomial «+ — 4x? + 1 is irreducible over Q[2] by TheoremB.4] 
(a) Prove that $x^4 - 4x^2 + 1$ can be factored mod $p$ as $(x^2 - \alpha)(x^2 - \beta)$ or $(x^2 - \alpha x + 1)(x^2 + \alpha x + 1)$
or $(x^2 - \alpha x - 1)(x^2 + \alpha x - 1)$ if 3 or 6 or 2 is a square mod $p$, respectively.




(b) Deduce that $x^4 - 4x^2 + 1$ (mod $p$) is reducible for every prime $p$.
(c)† Prove that every quartic polynomial of the form $x^4 + ax^2 + b^2$ factors into two quadratics
mod $p$, for every prime $p$.


Exercise 8.9.4. Prove that if $p \equiv 1 \pmod 4$, then $x^4 + 4$ factors into four linear factors mod $p$.


Exercise 8.9.5. Let f(.) be the totally multiplicative function for which f(3) = 1 and f(p) = (4) 
ifpA3. 
(a) Give a formula for f(n) for an arbitrary integer n. 
(b)? For any given large constant B, suppose that p is a prime for which (q/p) = f(q) for every 
prime q < B. Show that there are no three consecutive squares mod p that are all < B. 


This shows that the result in exercise b) cannot be extended to three consecutive integers 
provided the hypothesis in (b) holds. This hypothesis will be justified in exercise[8.17.2]of appendix 
8D. 


A . = d\ _— 
Exercise 8.9.6. Show that if (2) = —1, then Dain (4) =0. 
Exercise 8.9.7. Suppose that $a$ and $b$ are integers and $\{x_n : n \ge 0\}$ is the second-order linear
recurrence sequence given by (0.1.2) with $x_0 = 0$ and $x_1 = 1$. Using exercise 0.4.10(b) prove that
if odd prime $p$ divides some $x_n$ with $n$ odd, then $(-b/p) = 1$. Deduce that if $(-b/p) = -1$ and $p$
divides $x_n$, then $n$ is even.


Exercise 8.9.8. (a) Suppose that $p^k$ is an odd prime power. Prove that there are $1 + \left(\frac{a}{p}\right)$
residue classes $b$ (mod $p^k$) for which $b^2 \equiv a \pmod{p^k}$.
(b) Suppose that $n$ is an odd positive integer. Prove that there are $\prod_{p \text{ prime},\ p \mid n} \left(1 + \left(\frac{a}{p}\right)\right)$
residue classes $b$ (mod $n$) for which $b^2 \equiv a \pmod n$.
(c) Show that this equals $\sum_{d \mid n} \left(\frac{a}{d}\right)$ where the sum is restricted to squarefree integers $d$.


Exercise 8.9.9.† Let $p$ be a given odd prime.
(a) Prove that for every $m$ (mod $p$) there exist $a$ and $b$ mod $p$ such that $a^2 + b^2 \equiv m \pmod p$.
(b) Deduce that there are three squares, not all divisible by $p$, whose sum is divisible by $p$.
(c) Generalize this argument to show that if $a$, $b$, and $c$ are not divisible by $p$, then there are
at least $p$ solutions $x, y, z$ (mod $p$) to $ax^2 + by^2 + cz^2 \equiv 0 \pmod p$.


Exercise 8.9.10.1 Let m be a squarefree integer 4 1, and let a be an odd positive integer. 
(a) Prove that the Jacobi symbol (4*) is a periodic function of a of period dividing 4m. 
(b) Show that the Jacobi symbol (32) has minimal period 12. 


(c) Prove that if m is odd and (a,2m) = 1, then (4%) = (=) (#*). 


Now suppose that m = 3 (mod 4). 
(d) Prove that there exists an integer r for which (4%) = —1. 


(e) Prove that 47, (42) =0. 


Exercise 8.9.11. (This extends exercise[8.2.4]) 
(a) Let n = pq where p and q are distinct primes = 3 (mod 4), and m = 


Show that if (¢) = (4) = 1 and b=a™ (mod n), then b? =a (mod n 
(b) Any odd prime p can be written uniquely in the form p = 1 + 2*m where m is odd and 


peg et 40, 
ya 


m+1 


k > 1. Prove that if a is a 2*th power mod pandb=a 2 (mod p), then b? =a (mod p). 


If prime $p \equiv 1 \pmod 4$ and $(a/p) = 1$ but $a$ is not a fourth power mod $p$, then we do not know how
to use this idea to find a square root of $a$ (mod $p$). Known methods in this case are considerably
more complicated (see, e.g., [CP05]).

Exercise 8.9.12. Suppose that $p$ is a prime $\equiv 3 \pmod 4$ and $\left(\frac{b}{p}\right) = 1$. Prove that there are
exactly two solutions $x$ (mod $p$) to $x^4 \equiv b \pmod p$.


Exercise 8.9.13. Show that if $p$ is a prime which divides $m^2 - 15$ for some integer $m$, then
either $p = 2$, 3, or 5, or $p \equiv \pm 1, \pm 7, \pm 11$, or $\pm 17 \pmod{60}$.


Exercise 8.9.14.† Show that if $p$ is a prime $\equiv 1 \pmod 4$, then $-1$ is a fourth power (mod $p$) if
and only if 2 is a square mod $p$.


Exercise 8.9.15. If $(a,n) = 1$, then multiplication by $a$ (mod $n$) generates a permutation
of the reduced residues mod $n$. For example for 3 (mod 7) we get the permutation
$\sigma_{3,7} := (1, 3, 2, -1, -3, -2)$, whereas for 2 (mod 7) we get the permutation $\sigma_{2,7} := (1,2,4)(3,6,5)$. Prove
that if $p$ is prime and $(a,p) = 1$, then the signature⁶ of the permutation $\sigma_{a,p}$ equals $\left(\frac{a}{p}\right)$.


Exercise 8.9.16. (a) Prove that (354) = 0 if (m,n) > 1. 


(b) Suppose that n = mq+r where n >m>r > 2. Prove that (374) =-— (33). 
a S 


(c)? Prove that if n/m = [ao,a1,...,a%] with (n,m) = 1 and a, > 2, the (+54) =(-1)*41, 


Infinitely many primes. 


Exercise 8.9.17.' Fix odd, squarefree integer n > 1. Prove that there are infinitely many primes 
p for which (p/n) = —1. 


Exercise 8.9.18.† Let $n$ be a squarefree integer.
(a) By considering the prime divisors of $m^2 - n$, for well-chosen values of $m$, prove that there
are infinitely many primes $p$ for which $(n/p) = 1$.
(b) Deduce that there are infinitely many primes $\equiv 1 \pmod 3$.
(c) Refine this to deduce that there are infinitely many primes $\equiv 7 \pmod{12}$.
(d) Prove that there are infinitely many primes $\equiv 11 \pmod{12}$.
(e) Prove that there are infinitely many primes $\equiv 5 \pmod 8$.
(f) Prove that there are infinitely many primes $\equiv 7 \pmod 8$.
(g) Prove that there are infinitely many primes $\equiv 3 \pmod 8$.
(h) Prove that there are infinitely many primes $\equiv 5 \pmod{12}$.


Exercise 8.9.19.† Fix odd, squarefree integer $n > 1$. Using exercises 8.9.18(a) and 8.7.7 prove
that there are infinitely many primes $p$ for which $(p/n) = 1$.


In Ram Murty’s undergraduate thesis (1976, Carleton University, Ottawa) he
defined a Euclidean proof that there are infinitely many primes $\equiv a \pmod q$ to be
one in which we use a polynomial all of whose prime divisors either divide $q$ or are
$\equiv 1$ or $a \pmod q$. Several of the proofs for the different arithmetic progressions in
the last three questions can be formulated in this way. We gave such a proof for
$a = 1$ in Theorem 7.8. Murty went on to show that there is a Euclidean proof that
there are infinitely many primes $\equiv a \pmod q$ if and only if $a^2 \equiv 1 \pmod q$ (as in
all our examples here). To prove that there are infinitely many primes $\equiv 2$ or $\equiv 3 \pmod 5$,
or 5 (mod 7), etc., we will have to develop other techniques.


6 Any permutation can be described by a sequence of transpositions (swaps) of pairs of elements. 
Although the sequence, and even the number of swaps in such a sequence is not unique, the parity of 
the number of swaps is. This is called the signature of the permutation and is given by —1 or 1 (for an 
odd or even number of transpositions, respectively).

Further reading on Euclidean proofs


[1] M. Ram Murty and N. Thain, Primes in certain arithmetic progressions, Funct. Approx. Comment. 
Math. 35 (2006), 249-259. 


Primitive roots for specially chosen primes. 


Exercise 8.9.20.† Suppose that $q$ and $p = 2q+1$ are odd (Sophie Germain twin) primes.
(a) Show that if $p \equiv 3 \pmod 8$, then 2 is a primitive root mod $p$ (e.g., 11, 59, 83, 107, ...).
(b) Show that if $p \equiv 7 \pmod 8$, then $-2$ is a primitive root mod $p$.
(c) Prove that $-3$ is a primitive root mod $p$, but 3 is not.


Exercise 8.9.21.† Suppose that $q$ and $p = 4q+1$ are odd primes. Prove that 2, $-2$, 3, and $-3$
are all primitive roots mod $p$.


Exercise 8.9.22.† Suppose that the Fermat number $F_m = 2^{2^m}+1$ is prime with $m \ge 1$. Prove
that if $(q/F_m) = -1$, then $q$ is a primitive root mod $F_m$. (We deduce that 3 and 5 (for $m > 1$)
are primitive roots mod $F_m$, by exercise 8.5.4.)


Alternate proofs of the value of (2/n). 


Exercise 8.9.23. Let p be a prime = 1 (mod 4) so that there exists a reduced residue r (mod p) 
such that r? = —1 (mod p). 
(a) By expanding (r +1)? (mod p) prove that 2 is a square mod p if and only if r is a square 
mod p. 
(b) Prove that r is a square mod p if and only if there is an element of order 8 mod p. 
(c) Use Theorem [f.6]to deduce that 2 is a square mod p if and only if p=1 (mod 8). 


Exercise 8.9.24 (Proof of (87.2)). By induction on odd n > 1. By the law of quadratic reci- 
procity, as stated in (8.7.3), we have 


et He ae a 


as one of n and n— 2 is = 1 (mod 4). Complete the proof. 


Exercise 8.9.25. Every odd prime p may be written in the form p = 4k + 0 with 0 = ($+). 


We will show that (2) = (—1)* which implies TheoremB.4] Let m = 2k+o so that 2m =pto. 
Verify that 


a a ae 


and deduce the result from here. 


Further proofs of the law of quadratic reciprocity. 


Exercise 8.9.26.† (a) In the mid-18th century, Euler conjectured that if $m > n$ are coprime,
odd, positive integers, then $\left(\frac{a}{m}\right) = \left(\frac{a}{n}\right)$ where $m - n = 4a$ if $m \equiv n \pmod 4$, and $m + n = 4a$
otherwise. Use the law of quadratic reciprocity to prove Euler’s conjecture.


(b) Use Euler’s conjecture to prove (8.7.3), the law of quadratic reciprocity.


Scholze (1938) proved Euler’s conjecture using Gauss’s Lemma (Theorem[8.6) and so gave a 
different proof of the law of quadratic reciprocity. 


Exercise 8.9.27.† Finally we present my own variation of Rousseau’s proof of quadratic
reciprocity, as a series of (challenging) exercises. Let $p < q$ be odd primes, and let $n = pq$.
Let $A = \prod_{1 \le m < n/2,\ (m,n)=1} m$. In the proof given of Theorem 8.5 in section 8.6 we showed
that $A \equiv \left(\frac{-1}{q}\right)\left(\frac{q}{p}\right) \pmod p$ and, analogously, $A \equiv \left(\frac{-1}{p}\right)\left(\frac{p}{q}\right) \pmod q$. We now evaluate $A$
(mod $n$) much as in Gauss’s proof of Wilson’s Theorem, where we paired up each residue with its
inverse: Let $S$ be the set of (unordered) pairs $\{a,b\}$ with $a, b \in [1, \frac{n}{2})$ for which $ab \equiv 1$ or $-1 \pmod n$.
(a) Prove that the residues $a$ and $b$ are distinct unless $a^2 \equiv 1$ or $-1 \pmod n$.
(b) Prove that if $a^2 \equiv 1 \pmod n$, then $a \equiv 1$, $-1$, $r$, or $-r \pmod n$ for some $r \not\equiv \pm 1 \pmod n$.
(c) Prove that the product of the integers $a \in [1, \frac{n}{2})$ with $a^2 \equiv 1 \pmod n$ is $\equiv \pm r \pmod n$.


(f) 


Prove that if b? = —1 (mod n), then p=q=1 (mod 4). In this case: 
e Deduce that the product of the integers b € [1, 5) for which b? = —1 (mod n) is = +r 


(mod n). 
FG) ay Gs: 


e Deduce that A= +1 (mod n). 


e Combine the above to show that (= 


If at least one of p and q is = 3 (mod 4): 
e Deduce that A=+r (mod n). 


e Combine the above to show that (+) (4) = (3) (2). 
Deduce Theorem [8.5] 


Appendix 8A. Eisenstein’s 
proof of quadratic reciprocity 


8.10. Eisenstein’s elegant proof, 1844 
A lemma of Gauss gives a complicated but useful formula to determine $(a/p)$:


Theorem 8.6 (Gauss’s Lemma). Given an integer $a$ which is not divisible by odd
prime $p$, define $r_n$ to be the absolutely least residue of $an$ (mod $p$), and then define
the set $N := \{1 \le n \le \frac{p-1}{2} : r_n < 0\}$. Then $\left(\frac{a}{p}\right) = (-1)^{|N|}$.


For example, if $a = 3$ and $p = 7$, then $r_1 = 3$, $r_2 = -1$, $r_3 = 2$ so that $N = \{2\}$
and therefore $\left(\frac{3}{7}\right) = (-1)^1 = -1$.
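
Gauss's Lemma is straightforward to evaluate numerically. The sketch below (with the hypothetical name `gauss_lemma`) counts the $n \le \frac{p-1}{2}$ for which the absolutely least residue of $an$ is negative, which is the same as asking whether the least non-negative residue exceeds $p/2$.

```python
def gauss_lemma(a, p):
    """(a/p) for an odd prime p not dividing a, computed as (-1)^|N|."""
    size_of_N = 0
    for n in range(1, (p - 1) // 2 + 1):
        r = a * n % p            # least non-negative residue of a*n
        if r > p // 2:           # then the absolutely least residue r - p is negative
            size_of_N += 1
    return (-1) ** size_of_N


print(gauss_lemma(3, 7))         # the example above, where N = {2}
```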


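Proof. For each $m$, $1 \le m \le \frac{p-1}{2}$, there is exactly one integer $n$, $1 \le n \le \frac{p-1}{2}$,
such that $r_n = m$ or $-m$ (for if $an \equiv \pm an' \pmod p$, then $p \mid a(n \mp n')$, and
so $p \mid n \mp n'$, which is possible in this range only if $n = n'$). Therefore
\[
\begin{aligned}
\left(\frac{p-1}{2}\right)! = \prod_{1 \le m \le \frac{p-1}{2}} m &= \prod_{\substack{1 \le n \le \frac{p-1}{2} \\ n \notin N}} r_n \cdot \prod_{\substack{1 \le n \le \frac{p-1}{2} \\ n \in N}} (-r_n) \\
&\equiv \prod_{\substack{1 \le n \le \frac{p-1}{2} \\ n \notin N}} (an) \cdot \prod_{\substack{1 \le n \le \frac{p-1}{2} \\ n \in N}} (-an) = a^{\frac{p-1}{2}} (-1)^{|N|} \left(\frac{p-1}{2}\right)! \pmod p.
\end{aligned}
\]

Cancelling out the $\left(\frac{p-1}{2}\right)!$ from both sides, the result follows from Euler’s criterion.


This proof is a clever generalization of the proof of Theorem 8.4.

Exercise 8.10.1.† Use Gauss’s Lemma to determine the values of (a) $(-1/p)$ and of (b) $(3/p)$,
for all primes $p > 3$.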


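Exercise 8.10.2.† Let $r$ be the absolutely least residue of $N$ (mod $p$). Prove that the least
non-negative residue of $N$ (mod $p$) is given by
\[
N - p\left[\frac{N}{p}\right] = \begin{cases} r & \text{if } r \ge 0, \\ p + r & \text{if } r < 0. \end{cases}
\]

Corollary 8.10.1. If $p$ is a prime > 2 and $a$ is an odd integer not divisible by $p$,
then
\[
\left(\frac{a}{p}\right) = (-1)^{\sum_{n=1}^{\frac{p-1}{2}} \left[\frac{an}{p}\right]}. \tag{8.10.1}
\]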


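Proof. (Gauss) By exercise 8.10.2 we have
\[
\sum_{n=1}^{\frac{p-1}{2}} \left(an - p\left[\frac{an}{p}\right]\right) = \sum_{\substack{n=1 \\ n \notin N}}^{\frac{p-1}{2}} r_n + \sum_{\substack{n=1 \\ n \in N}}^{\frac{p-1}{2}} (p + r_n) = \sum_{n=1}^{\frac{p-1}{2}} r_n + p|N|. \tag{8.10.2}
\]

In the proof of Gauss’s Lemma we saw that for each $m$, $1 \le m \le \frac{p-1}{2}$, there is
exactly one integer $n$, $1 \le n \le \frac{p-1}{2}$, such that $r_n = m$ or $-m$, and so $r_n \equiv m \pmod 2$.
Therefore, as $a$ and $p$ are odd, (8.10.2) implies that
\[
|N| \equiv \sum_{n=1}^{\frac{p-1}{2}} \left[\frac{an}{p}\right] \pmod 2 \quad\text{as}\quad \sum_{n=1}^{\frac{p-1}{2}} r_n \equiv \sum_{n=1}^{\frac{p-1}{2}} n \equiv \sum_{n=1}^{\frac{p-1}{2}} an \pmod 2.
\]

We now deduce (8.10.1) from Gauss’s Lemma.


The exponent $\sum_{n=1}^{\frac{p-1}{2}} \left[\frac{an}{p}\right]$ on the right-hand side of (8.10.1) looks excessively
complicated. However it arises in a different context that is easier to work with: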


Lemma 8.10.1. Suppose that $a$ and $b$ are odd, coprime positive integers. There
are
\[
\sum_{n=1}^{\frac{b-1}{2}} \left[\frac{an}{b}\right]
\]
lattice points $(n,m) \in \mathbb{Z}^2$ for which $0 < bm < an$ with $0 < n < b/2$.


Proof. We seek the number of lattice points $(n,m)$ inside the triangle bounded
by the lines $y = 0$, $x = \frac{b}{2}$ and $by = ax$. For such a lattice point, $n$ can be any
integer in the range $1 \le n \le \frac{b-1}{2}$. For a given value of $n$, the triangle contains the
lattice points $(n,m)$ where $m$ is any integer in the range $0 < m < \frac{an}{b}$. These are
the lattice points in the shaded rectangle in Figure 8.1.


Figure 8.1. The shaded rectangle covers the lattice points $(n,m)$ with $1 \le m \le \left[\frac{an}{b}\right]$.


Evidently $m$ ranges from 1 to $\left[\frac{an}{b}\right]$, and so there are $\left[\frac{an}{b}\right]$ such lattice points. Summing
this up over the possible values of $n$ gives the lemma.


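Corollary 8.10.2. If $a$ and $b$ are odd coprime positive integers, then
\[
\sum_{n=1}^{\frac{b-1}{2}} \left[\frac{an}{b}\right] + \sum_{m=1}^{\frac{a-1}{2}} \left[\frac{bm}{a}\right] = \frac{a-1}{2} \cdot \frac{b-1}{2}.
\]

Proof. The idea is to split the rectangle
\[
R = \left\{ (x,y) : 0 < x < \tfrac{b}{2} \text{ and } 0 < y < \tfrac{a}{2} \right\}
\]
into two parts: the points in $R$ on or below the line $by = ax$, that is, in the region
\[
A := \{ (x,y) : 0 < x < b/2 \text{ and } 0 < y \le ax/b \};
\]
and the points in $R$ above the line $by = ax$, that is, in the region
\[
B := \{ (x,y) : 0 < x < by/a \text{ and } 0 < y < a/2 \}.
\]

Figure 8.2. Splitting the rectangle $R$ into two parts.


We count the lattice points (that is, the points with integer coordinates) in $R$
and then in $A$ and $B$ together. To begin with
\[
R \cap \mathbb{Z}^2 = \left\{ (n,m) \in \mathbb{Z}^2 : 1 \le n \le \tfrac{b-1}{2} \text{ and } 1 \le m \le \tfrac{a-1}{2} \right\},
\]
so that $|R \cap \mathbb{Z}^2| = \frac{a-1}{2} \cdot \frac{b-1}{2}$.


Since there are no lattice points in $R$ on the line $by = ax$, as $(a,b) = 1$, therefore
\[
A \cap \mathbb{Z}^2 = \{ (n,m) \in \mathbb{Z}^2 : 0 < n < b/2 \text{ and } 0 < bm < an \},
\]
and so $|A \cap \mathbb{Z}^2| = \sum_{n=1}^{\frac{b-1}{2}} \left[\frac{an}{b}\right]$ by Lemma 8.10.1. Similarly
\[
B \cap \mathbb{Z}^2 = \{ (n,m) \in \mathbb{Z}^2 : 0 < m < a/2 \text{ and } 0 < an < bm \},
\]
and so $|B \cap \mathbb{Z}^2| = \sum_{m=1}^{\frac{a-1}{2}} \left[\frac{bm}{a}\right]$ by Lemma 8.10.1 (with the roles of $a$ and $b$ interchanged).
The result then follows from the observation that $A \cap \mathbb{Z}^2$ and $B \cap \mathbb{Z}^2$
partition $R \cap \mathbb{Z}^2$.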

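Eisenstein’s proof of the law of quadratic reciprocity. By Corollary 8.10.1
with $a = q$, and then with the roles of $p$ and $q$ reversed, and then by Corollary
8.10.2 we deduce the desired law of quadratic reciprocity:
\[
\left(\frac{p}{q}\right)\left(\frac{q}{p}\right) = (-1)^{\sum_{n=1}^{\frac{p-1}{2}} \left[\frac{qn}{p}\right]} \cdot (-1)^{\sum_{m=1}^{\frac{q-1}{2}} \left[\frac{pm}{q}\right]} = (-1)^{\frac{p-1}{2} \cdot \frac{q-1}{2}}.
\]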


Appendices. The extended version of chapter 8 has the following additional 
appendices: 


Appendix 8B. Small quadratic non-residues. For a given prime $p$ we show that
there are small integers $m$ and $n$ for which $\left(\frac{m}{p}\right) = 1$ and $\left(\frac{n}{p}\right) = -1$, and we discuss
some of the latest developments in bounding $m$ and $n$.

Appendix 8C. The first proof of quadratic reciprocity presents Gauss’s original
proof of quadratic reciprocity. It is a wonderfully ingenious use of solutions
to quadratic equations, though a little more complicated than the proofs already
presented.


Appendix 8D. Dirichlet characters and primes in arithmetic progressions. Here 
we present the vitally important generalization of the Legendre and Jacobi symbols 
to Dirichlet characters. To determine all of the characters themselves requires a neat 
theory. We then indicate how these were applied by Dirichlet to prove that there 
are infinitely many primes in any arithmetic progression a (mod q) with (a,q) = 1. 

Appendix 8E. Quadratic reciprocity and recurrence sequences. We study the
$p$-divisibility of second-order linear recurrence sequences, which depends on the values
of certain Legendre symbols.


Chapter 9 


Quadratic equations 


Can we tell whether a given large integer is the sum of two squares of integers (other
than by summing every possible pair of smaller squares)? How about the values
of other quadratics? We will show, in this chapter, how we can understand a lot
about solutions to quadratic equations in integers, by understanding the solutions
to those quadratic equations modulo $p$, for every prime $p$. We begin by studying the
values taken by $x^2 + y^2$ when we substitute integers in for $x$ and $y$, then $ax^2 + by^2$
for arbitrary integer coefficients $a, b$, and then finally the general binary quadratic
form, $ax^2 + bxy + cy^2$.


9.1. Sums of two squares 


The list of integers that are the sum of two squares of integers begins: 
0,1, 2,4, 5, 8, 9, 10, 13, 16, 17, 18, 20, 25, 26, 29, 32, 34, 36, 37, 40, 41, 45, 49, 50,.... 


Is there a pattern? Can we easily determine whether a given integer is the sum of 
two squares by any means other than trying to find two squares that sum to it? No 
pattern emerges easily from the list above so we begin focusing on the primes that 
appear in this list, namely 


$2 = 1^2 + 1^2$, $5 = 1^2 + 2^2$, $13 = 2^2 + 3^2$, $17 = 1^2 + 4^2$, $29 = 5^2 + 2^2$, $37 = 1^2 + 6^2$, ....


What do the odd primes in the list, 5, 13, 17, 29, 37, 41, 53, 61, 73, 89, 97, ... have in
common? The only easy-to-spot pattern is that the differences between consecutive
odd primes in our list, $13 - 5$, $17 - 13$, $29 - 17$, ... are all multiples of 4, which implies
that these primes are all $\equiv 1 \pmod 4$.


Proposition 9.1.1. If $p$ is an odd prime that is the sum of two squares, then $p \equiv 1 \pmod 4$.

Proof. If $p = a^2 + b^2$, then $p \nmid a$, or else $p \mid p - a^2 = b^2$ so that $p \mid b$ and $p^2 \mid a^2 + b^2 = p$,
which is impossible. Similarly $p \nmid b$. Now $a^2 \equiv -b^2 \pmod p$ so that

$$1 = \left(\frac{a}{p}\right)^2 = \left(\frac{-1}{p}\right)\left(\frac{b}{p}\right)^2 = \left(\frac{-1}{p}\right),$$


and therefore p = 1 (mod 4) by Theorem 


Exercise 9.1.1. Prove that any odd integer $n$ that can be written as the sum of two squares
must be $\equiv 1 \pmod 4$. Deduce Proposition 9.1.1.


Exercise 9.1.2. Prove that if prime $p$ divides $a^2 + b^2$, then either $p = 2$ or $p$ divides $(a,b)$ or
$p \equiv 1 \pmod 4$.


Remarkably this is an “if and only if ” condition: 


Theorem 9.1. Every prime $p \equiv 1 \pmod 4$ can be written as the sum of two
squares (of integers).


Proof. Since $p \equiv 1 \pmod 4$ we know that there exists an integer $b$ such that
$b^2 \equiv -1 \pmod p$. Consider now the set of integers

$$\{ j + kb : 0 \le j, k \le [\sqrt{p}] \}.$$

The number of pairs of integers $j, k$ used in the construction of this set is
$([\sqrt{p}] + 1)^2 > p$, and so by the pigeonhole principle, two of the numbers in the
set must be congruent mod $p$; say that

$$j + kb \equiv J + Kb \pmod p$$

where $0 \le j, k, J, K \le [\sqrt{p}]$ and $(j,k) \ne (J,K)$. Let $r = j - J$ and $s = K - k$ so
that

$$r \equiv bs \pmod p$$

where $|r|, |s| \le [\sqrt{p}] < \sqrt{p}$ and $r$ and $s$ are not both 0. Now

$$r^2 + s^2 \equiv (bs)^2 + s^2 = s^2(b^2 + 1) \equiv 0 \pmod p,$$

and $0 < r^2 + s^2 < (\sqrt{p})^2 + (\sqrt{p})^2 = 2p$. The only multiple of $p$ between 0 and $2p$ is $p$,
and therefore $r^2 + s^2 = p$.
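
The pigeonhole argument is constructive, so it can be run directly. The sketch below is only an illustration: it assumes $p$ is a prime $\equiv 1 \pmod 4$, finds the square root $b$ of $-1$ from a quadratic non-residue (a standard device, not part of the proof above), and uses a function name of our own choosing.

```python
from math import isqrt

def two_squares_by_pigeonhole(p):
    """For a prime p ≡ 1 (mod 4), return (r, s) with r*r + s*s == p."""
    # If g is a quadratic non-residue mod p then b = g^((p-1)/4) satisfies
    # b^2 = g^((p-1)/2) ≡ -1 (mod p) by Euler's criterion.
    g = next(g for g in range(2, p) if pow(g, (p - 1) // 2, p) == p - 1)
    b = pow(g, (p - 1) // 4, p)
    m = isqrt(p)                      # m = [sqrt(p)]
    seen = {}                         # residue of j + k*b  ->  the pair (j, k)
    for j in range(m + 1):
        for k in range(m + 1):
            residue = (j + k * b) % p
            if residue in seen:       # two pairs collide mod p
                J, K = seen[residue]
                return abs(j - J), abs(K - k)
            seen[residue] = (j, k)
    raise ValueError("no collision found; is p a prime congruent to 1 mod 4?")

for p in (5, 13, 17, 29, 37, 41):
    r, s = two_squares_by_pigeonhole(p)
    print(p, r, s, r * r + s * s == p)
```

Since $([\sqrt{p}] + 1)^2 > p$ pairs are examined, a collision must occur before the loops finish.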


We will use the identity

(9.1.1) $(a^2 + b^2)(c^2 + d^2) = (ac - bd)^2 + (ad + bc)^2$

to determine which composite integers can be written as the sum of two squares.
Theorem 9.1 tells us that any prime $p \equiv 1 \pmod 4$ can be written as the sum of
two squares; for example $5 = 1^2 + 2^2$ and $13 = 2^2 + 3^2$. Then (9.1.1) yields that
$65 = 4^2 + 7^2$; if we write instead $13 = 3^2 + 2^2$, then we obtain $65 = 1^2 + 8^2$. Indeed
any integer that is the product of two distinct primes $\equiv 1 \pmod 4$ can be written
as the sum of two squares like this, and even in two different ways. We will discuss
the number of representations further in appendix 9C.
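
The two computations for 65 can be reproduced mechanically from the identity (9.1.1). This is a small sketch; the helper name `compose` is hypothetical.

```python
def compose(rep1, rep2):
    """Combine a^2 + b^2 and c^2 + d^2 into a representation of their product.

    Uses (a^2 + b^2)(c^2 + d^2) = (ac - bd)^2 + (ad + bc)^2, with absolute
    values taken so that the output is a pair of non-negative integers.
    """
    a, b = rep1
    c, d = rep2
    return abs(a * c - b * d), abs(a * d + b * c)

print(compose((1, 2), (2, 3)))   # 5 = 1^2 + 2^2 and 13 = 2^2 + 3^2
print(compose((1, 2), (3, 2)))   # the same primes, with 13 written as 3^2 + 2^2
```

Swapping the order of the two squares in one factor is what produces the second representation.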


Exercise 9.1.3. Find four distinct representations of 1105 = 5 x 13 x 17 as a sum of two squares. 


Exercise 9.1.4. Prove that if $n = n_1 \cdots n_k$ where $n_1, \ldots, n_k$ are each the sum of two squares,
then $n$ is the sum of two squares.


Theorem 9.2. Positive integer $n$ can be written as the sum of two squares of
integers if and only if for every prime $p \equiv 3 \pmod 4$ which divides $n$, the exact
power of $p$ dividing $n$ is even.


Proof. Suppose that $n = a^2 + b^2$ where $g = (a,b)$, so we can write $a = gA$, $b = gB$,
and $n = g^2 N$ for some coprime integers $A$ and $B$, with $N = A^2 + B^2$. Therefore if
$p$ is a prime $\equiv 3 \pmod 4$, then $p$ cannot divide $N$, by exercise and so if $p \mid n$,
then $p \mid g$. Moreover if $p^k \| g$, then $p^{2k} \| n$, as claimed.

On the other hand, if $n = g^2 m$ where $m$ is squarefree, then $m$ has no prime
factors $\equiv 3 \pmod 4$ by the hypothesis. Therefore all the prime factors of $m$ can be
written as the sum of two squares by Theorem 9.1, and so their product, $m$, is the
sum of two squares by exercise 9.1.4, say $m = u^2 + v^2$. Then $n = (gu)^2 + (gv)^2$.
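
As a sanity check on Theorem 9.2, one can compare the criterion with a brute-force search. The sketch below assumes nothing beyond the statement of the theorem; both function names are ours, and trial division is used only because the numbers are small.

```python
from math import isqrt

def is_sum_of_two_squares(n):
    """Brute force: is n = a^2 + b^2 for some integers a, b >= 0?"""
    return any(isqrt(n - a * a) ** 2 == n - a * a for a in range(isqrt(n) + 1))

def satisfies_criterion(n):
    """Theorem 9.2 for n >= 1: primes ≡ 3 (mod 4) divide n to an even power."""
    p = 2
    while p * p <= n:
        e = 0
        while n % p == 0:     # strip out the full power of p
            n //= p
            e += 1
        if p % 4 == 3 and e % 2 == 1:
            return False
        p += 1
    # What remains is 1 or a single prime occurring to the first power.
    return n % 4 != 3

print([n for n in range(51) if is_sum_of_two_squares(n)])
print(all(satisfies_criterion(n) == is_sum_of_two_squares(n) for n in range(1, 2001)))
```

The first printed list should reproduce the list at the start of section 9.1.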


Exercise 9.1.5. Prove that if n is squarefree and is the sum of two squares, then every positive 
divisor of n is also the sum of two squares. 


We saw that (9.1.1) is a useful identity. To find such an identity let $i$ be a
complex number for which $i^2 = -1$. Then $x^2 + y^2 = (x + iy)(x - iy)$, a factorization
into numbers of the form $a + bi$ where $a$ and $b$ are integers. Therefore

$$\begin{aligned}
(a^2 + b^2)(c^2 + d^2) &= (a + bi)(a - bi) \cdot (c + di)(c - di) \\
&= (a + bi)(c + di) \cdot (a - bi)(c - di) \\
&= ((ac - bd) + (ad + bc)i) \cdot ((ac - bd) - (ad + bc)i) \\
&= (ac - bd)^2 + (ad + bc)^2,
\end{aligned}$$

and so we get (9.1.1). A different rearrangement leads to a different identity:

(9.1.2) $(a^2 + b^2)(c^2 + d^2) = (a + bi)(c - di) \cdot (a - bi)(c + di) = (ac + bd)^2 + (ad - bc)^2$.


Theorem [9.2] has the following surprising corollary: 


Exercise 9.1.6. Deduce that positive integer n can be written as the sum of two squares of 
rationals if and only if n can be written as the sum of two squares of integers. 


This suggests that we can focus, in this question, on rational solutions. In
section 6.1 we saw how to find all solutions to $x^2 + y^2 = 1$ in rationals $x, y$. How
about all rational solutions to $x^2 + y^2 = n$?


Proposition 9.1.2. Suppose that $n = a^2 + b^2$. Then all solutions in rationals $x, y$
to $x^2 + y^2 = n$ are given by the parametrization

(9.1.3) $$x = \frac{2brs + a(r^2 - s^2)}{r^2 + s^2}, \qquad y = \frac{2ars + b(s^2 - r^2)}{r^2 + s^2},$$

where $r$ and $s$ are coprime integers.


Proof. Let $x, y$ be any rationals for which $x^2 + y^2 = n$. Just as in our geometric
proof of (6.1.1) we will parametrize these rational points $(x,y)$ by noting that if $t$
is the slope of the line between $(a,b)$ and $(x,y)$, then $t$ is rational, and vice versa.
In particular we let $u = x - a$ and $t = (y - b)/u$ when $u \ne 0$, which must both be
rational numbers. Then

$$0 = n - n = (a + u)^2 + (b + tu)^2 - (a^2 + b^2) = 2u(a + bt) + u^2(1 + t^2),$$

so that, as $u \ne 0$, we have

$$u = \frac{-2(a + bt)}{1 + t^2} = \frac{2brs - 2as^2}{r^2 + s^2},$$

writing the rational number $t$ as $t = -r/s$ where $r$ and $s$ are coprime integers.
Substituting this value of $u$ into $x = a + u$ and $y = b + ut$ gives the claimed
parametrization.


If $u = 0$, then $x = a$ so that either $y = b$ or $y = -b$. The line between $(a,b)$
and $(a,-b)$ is the vertical line $x = a$ (corresponding to $r = 1$, $s = 0$ so that $t = \infty$).
Finally we obtain the initial point $(a,b)$ in this parametrization by taking $r = a$, $s = b$.
This is obtained by taking the slope to be $t = -a/b$, the slope of the
tangent line to the curve $x^2 + y^2 = n$ at the point $(a,b)$.
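
The parametrization (9.1.3) is easy to test with exact rational arithmetic. The following sketch uses Python's `fractions` module; the function name `rational_point` is ours, and $n = 5 = 1^2 + 2^2$ is just a sample value.

```python
from fractions import Fraction

def rational_point(a, b, r, s):
    """The point (x, y) of (9.1.3) on x^2 + y^2 = a^2 + b^2 for given r, s."""
    d = r * r + s * s
    x = Fraction(2 * b * r * s + a * (r * r - s * s), d)
    y = Fraction(2 * a * r * s + b * (s * s - r * r), d)
    return x, y

a, b = 1, 2
for r, s in [(1, 0), (1, 2), (2, 1), (3, 1), (-2, 5)]:
    x, y = rational_point(a, b, r, s)
    print((r, s), (x, y), x * x + y * y == a * a + b * b)
```

Taking $(r,s) = (a,b)$ should return the starting point $(a,b)$, and $(r,s) = (1,0)$ should return $(a,-b)$, as described in the proof.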


In Theorem [9.1] we saw that every prime p= 1 (mod 4) can be written as the 
sum of two squares. Examples suggest that there is a unique such representation, 
up to signs and changing the order of the squares, as the reader will now prove: 
Exercise 9.1.7.† Suppose that prime $p = a^2 + b^2$.

(a) Prove that $|a|, |b| < \sqrt{p}$.
(b) Prove that if $r^2 \equiv -1 \pmod p$, then either $r \equiv a/b \pmod p$ or $r \equiv b/a \pmod p$.
(c) If prime $p$ divides $c^2 + d^2$ but $p \nmid cd$, show that $p$ divides either $ac - bd$ or $ad - bc$, and deduce
that $p$ divides both terms on the right-hand side of either (9.1.1) or (9.1.2), respectively.
(d) Suppose that $p = a^2 + b^2 = c^2 + d^2$ where $a, b, c, d > 0$. Show that $\{a,b\} = \{c,d\}$.

In other words, we have proved that each prime $\equiv 1 \pmod 4$ has a unique representation as the
sum of two squares, unique up to changing the order of the squares, or their signs.


Exercise 9.1.8.† Prove, using the method of Theorem 9.1, that a squarefree integer $n$ can be
written as the sum of two squares if and only if $-1$ is a square mod $n$.


9.2. The values of $x^2 + dy^2$


What values does $x^2 + 2y^2$ take? Let's start again with the prime values:

2, 3, 11, 17, 19, 41, 43, 59, 67, 73, 83, 89, 97, ....

There is no obvious pattern; but this list contains exactly the same odd primes that
we found in section 8.4 when exploring when $\left(\frac{-2}{p}\right) = 1$. This link is no coincidence
for if we suppose that odd prime $p = x^2 + 2y^2$, then $p$ does not divide $x$ or $y$ and so

$$1 = \left(\frac{x}{p}\right)^2 = \left(\frac{-2y^2}{p}\right) = \left(\frac{-2}{p}\right)\left(\frac{y}{p}\right)^2 = \left(\frac{-2}{p}\right).$$

From and (8.7.2), we know that $\left(\frac{-2}{p}\right) = 1$ if and only if $p \equiv 1$ or $3 \pmod 8$.


On the other hand if $(-2/p) = 1$, then select $b \pmod p$ such that $b^2 \equiv -2 \pmod p$.
We take $R = 2^{1/4}\sqrt{p}$, $S = 2^{-1/4}\sqrt{p}$ in exercise 9.7.3 so that there exist
integers $r$ and $s$, not both 0, with $|r| < R$ and $|s| < S$, for which $p$ divides $r^2 + 2s^2$.
Therefore $0 < r^2 + 2s^2 < 2^{3/2} p < 3p$, and so $r^2 + 2s^2 = p$ or $2p$. In the latter case,
2 divides $2p - 2s^2 = r^2$ so that $2 \mid r$. Writing $r = 2r_1$ we have $s^2 + 2r_1^2 = p$. Hence,
either way, $p$ can be written in the form $m^2 + 2n^2$. Therefore we have proved:


Theorem 9.3. Odd prime $p$ can be written in the form $m^2 + 2n^2$ if and only if
$p \equiv 1$ or $3 \pmod 8$.

The identity

$$(a^2 + 2b^2)(c^2 + 2d^2) = (ac + 2bd)^2 + 2(ad - bc)^2$$

is analogous to (9.1.1). Using this, one can prove, analogous to the proof for $u^2 + v^2$
in the first half of section 9.1, that positive integer $n$ can be written as $r^2 + 2s^2$ if
and only if for every prime $p \equiv 5$ or $7 \pmod 8$ which divides $n$, the exact power of
$p$ dividing $n$ is even.
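
Both statements about $r^2 + 2s^2$ can be compared with a direct search. This is a minimal sketch with function names of our own; it checks small values only and is not a proof.

```python
from math import isqrt

def is_r2_plus_2s2(n):
    """Brute force: is n = r^2 + 2 s^2 for some integers r, s >= 0?"""
    return any(isqrt(n - 2 * s * s) ** 2 == n - 2 * s * s
               for s in range(isqrt(n // 2) + 1))

def satisfies_criterion(n):
    """For n >= 1: primes ≡ 5 or 7 (mod 8) divide n to an even power."""
    p = 2
    while p * p <= n:
        e = 0
        while n % p == 0:     # strip out the full power of p
            n //= p
            e += 1
        if p % 8 in (5, 7) and e % 2 == 1:
            return False
        p += 1
    # What remains is 1 or a single prime occurring to the first power.
    return n % 8 not in (5, 7)

print(all(satisfies_criterion(n) == is_r2_plus_2s2(n) for n in range(1, 2001)))
```

Restricting to odd primes $n = p$, the criterion reduces to $p \equiv 1$ or $3 \pmod 8$, which is Theorem 9.3.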

Can we also modify this proof for values of $x^2 + 3y^2$? Or $x^2 + 5y^2$? We explore
this in the following exercises.


Exercise 9.2.1. Fix integer $d \ge 1$. Give an identity showing that the product of two integers of
the form $a^2 + db^2$ is also of this form.


Exercise 9.2.2. Which primes are of the form $a^2 + 3b^2$? Which integers?


Exercise 9.2.3. Which primes are of the form $a^2 + 5b^2$? Try listing what primes are represented
and compare the list with the set of primes $p$ for which $(-5/p) = 1$.


9.3. Is there a solution to a given quadratic equation? 


It is easy to see that there do not exist non-zero integers $a, b, c$ such that $a^2 + 5b^2 = 3c^2$,
for, if we take the smallest non-zero solution, then we have

$$a^2 \equiv 3c^2 \pmod 5$$

which implies that $a \equiv c \equiv 0 \pmod 5$ since $(3/5) = -1$, and so $b \equiv 0 \pmod 5$.
Therefore $a/5, b/5, c/5$ gives a smaller solution to $x^2 + 5y^2 = 3z^2$, contradicting
minimality.

Another proof stems from looking at the equation mod 4 since then $a^2 + b^2 + c^2 \equiv 0 \pmod 4$,
and 0 and 1 are the only squares mod 4. Therefore if three squares sum
to an integer that is $0 \pmod 4$, then they must all be even. But then $a/2, b/2, c/2$
gives a smaller solution, contradicting minimality.


So we have now presented two different proofs that there are no non-zero solutions
in integers to $a^2 + 5b^2 = 3c^2$, by working with two different moduli.
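
The two descent steps above rest on finite checks, which can be confirmed by exhausting residues. In this sketch (the helper name is ours) we work mod 25 rather than mod 5 for the first proof, since the conclusion $5 \mid b$ uses the equation modulo $5^2$.

```python
def solutions_mod(q):
    """All triples (a, b, c) mod q with a^2 + 5 b^2 ≡ 3 c^2 (mod q)."""
    return [(a, b, c) for a in range(q) for b in range(q) for c in range(q)
            if (a * a + 5 * b * b - 3 * c * c) % q == 0]

# Mod 5 the equation forces a ≡ c ≡ 0; mod 25 it then forces b ≡ 0 (mod 5) too.
print(all(a % 5 == 0 and c % 5 == 0 for a, b, c in solutions_mod(5)))
print(all(a % 5 == 0 and b % 5 == 0 and c % 5 == 0 for a, b, c in solutions_mod(25)))
# Mod 4 the equation reads a^2 + b^2 + c^2 ≡ 0, which forces a, b, c all even.
print(all(a % 2 == 0 and b % 2 == 0 and c % 2 == 0 for a, b, c in solutions_mod(4)))
```

Each check says that every solution is divisible by the relevant prime, which is what allows the division by 5 (or by 2) in the descent.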


For all quadratic equations in three or more variables with real solutions, there
is never just one prime or prime power modulo which there are no solutions to the
given equation; when there is one, there is always a second. And indeed when
there is a third proof, then there is always a fourth. A remarkable consequence
of the theory (see appendix 9B) is that if a given quadratic equation in three or
more variables has non-zero real solutions but no non-zero integer solutions, then
there is always an even number of different primes $p$ such that the given
equation has no non-trivial solutions mod $p^k$ for some $k \ge 1$. Moreover the odd
primes involved must divide the coefficients of the equation. On the other hand, if
there are no such “mod $p^k$ obstructions”, then there must be at least one non-zero
integer solution (implying that there must be a real solution!).


In exercise 3.6.4 we proved that there are integer solutions $(m,n)$ to $am + bn = c$
if and only if there are solutions $u, v \pmod b$ to $au + bv \equiv c \pmod b$. Similarly we
will show that if $a$, $b$, and $c$ are pairwise coprime, positive integers, then there are
rational solutions $(x,y)$ to $ax^2 + by^2 = c$ if and only if there are coprime solutions
$u, v \pmod{4abc}$ to $au^2 + bv^2 \equiv c \pmod{4abc}$. This is an amazing theorem since
to determine whether a quadratic equation has solutions in rationals we need only
verify whether it has solutions modulo a finite modulus.


To work on rational solutions $(x,y)$ to $ax^2 + by^2 = c$ it is convenient to develop
this into a question about integer solutions and to manipulate the equation to a
more convenient form:


(i) We may assume that each of $a, b, c$ is a squarefree integer or else, if, say,
$a = p^2 A$, the rational solutions to $ax^2 + by^2 = c$ are in 1-to-1 correspondence
with those of $AX^2 + by^2 = c$, taking $X = px$. If $b$ is divisible by a square, we
proceed analogously. If $c = q^2 C$, then the rational solutions to $ax^2 + by^2 = c$
are in 1-to-1 correspondence with those of $aX^2 + bY^2 = C$, taking $X = x/q$
and $Y = y/q$.

(ii) We may assume that $a, b, c$ are pairwise coprime or else if, say, $a = pA$ and
$b = pB$, then $AX^2 + BY^2 = C$ with $X = px$, $Y = py$, and $C = pc$; and if
$a = qA$ and $c = qC$, then $Ax^2 + BY^2 = C$ with $B = bq$ and $Y = y/q$.

(iii) Letting $n$ be the lowest common denominator of the rationals $x$ and $y$, we
write $x = \ell/n$ with $y = m/n$ so that $\ell, m, n$ are integers with $(\ell, m, n) = 1$
and $a\ell^2 + bm^2 = cn^2$.

(iv) We may assume that $a\ell^2, bm^2, cn^2$ are pairwise coprime. If not, suppose that
prime $p$ divides $a\ell^2$ and $bm^2$, so that $p$ divides $a\ell^2 + bm^2 = cn^2$. Now $p$
can only divide one of $a, b, c$ (since they are pairwise coprime), say, $c$, and
so must divide $\ell^2$ and $m^2$. But then $p$ divides $\ell$ and $m$, and so $p^2$ divides
$a\ell^2 + bm^2 = cn^2$. Hence $p$ divides $n$, as $p^2 \nmid c$, contradicting that $(\ell, m, n) = 1$.


Therefore the correct formulation of our result is as follows: 


Theorem 9.4 (The local-global principle for quadratic equations). Let $a$, $b$,
and $c$ be given pairwise coprime, squarefree integers. There are solutions in

    non-zero integers $\ell, m, n$ to $a\ell^2 + bm^2 + cn^2 = 0$ with $(a\ell^2, bm^2) = 1$

if and only if there are solutions in

    non-zero real numbers $\lambda, \mu, \nu$ to $a\lambda^2 + b\mu^2 + c\nu^2 = 0$,

and, for all positive integers $r$, there exist

    residue classes $u, v, w \pmod r$ for which $au^2 + bv^2 + cw^2 \equiv 0 \pmod r$,

with $(au^2, bv^2, cw^2, r) = 1$.


Proof $\Rightarrow$: We may take $\lambda = u = \ell$, $\mu = v = m$, $\nu = w = n$ throughout.


The proof in the other direction is the difficult part; it follows along the lines
of the proof of Theorem 9.1 but is more complicated. In appendix 9A we rephrase
that proof in the language of lattices, before completing the proof of the local-global
principle.

We can reduce the set of moduli to be considered using the following lemma. 


Lemma 9.3.1. Let $a, b, c$ be given pairwise coprime, squarefree integers. There are
residue classes $u, v, w \pmod r$ with $(au^2, bv^2, cw^2, r) = 1$ for which

$$au^2 + bv^2 + cw^2 \equiv 0 \pmod r$$

for every positive integer $r$, if and only if there are such solutions for $r = 8$, and
for $r = p$ for every odd prime $p$ dividing $abc$.


This result implies that, as in exercise 3.6.4, we can restrict our attention in
Theorem 9.4 to just one modulus, namely $r = 8|abc|$.


Proof. We can restrict our attention to prime power moduli $p^k$ by the Chinese
Remainder Theorem. We will prove that there are such appropriate solutions mod
$p^k$ by induction on $k$: for $k \ge 1$ when $p$ is odd and for $k \ge 3$ when $p = 2$. There are
appropriate solutions modulo every odd prime $p$ and modulo $2^3$, by the hypothesis
for primes $p$ dividing $2abc$, and by exercise 8.9.9 for all odd primes $p$ that do not
divide $abc$.


So now assume we have an appropriate solution mod $p^k$, so that $p$ does not
divide at least one of $au^2, bv^2, cw^2$, say, $au^2$ (and an analogous argument works
if $p$ does not divide one of the others). Let $R = -a^{-1}(bv^2 + cw^2)$, so that
$u^2 \equiv R \pmod{p^k}$ by the induction hypothesis. By Proposition there exists
$U \pmod{p^{k+1}}$ for which $U^2 \equiv R \pmod{p^{k+1}}$ so that $aU^2 + bv^2 + cw^2 \equiv 0 \pmod{p^{k+1}}$
and $(U, p) = 1$.


Now if $au^2 + bv^2 + cw^2 \equiv 0 \pmod a$ with $(a, bv^2, cw^2) = 1$, then $-bc \equiv (cw/v)^2 \pmod a$;
that is, $-bc$ is a square (mod $p$) for every prime $p$ dividing $a$. Making
similar remarks modulo $b$ and $c$, we find Legendre's formulation of the local-global
principle.¹


Theorem 9.5 (Legendre's local-global principle, 1785). Let $a, b, c$ be given
pairwise coprime, squarefree integers which do not all have the same sign. There
are solutions in non-zero integers $\ell, m, n$ to $a\ell^2 + bm^2 + cn^2 = 0$ if and only if
$-ab$ is a square mod $|c|$, $-ac$ is a square mod $|b|$, and $-bc$ is a square mod $|a|$.


Note that $a\ell^2 + bm^2 + cn^2 = 0$ has solutions in non-zero reals if and only if
$a, b, c$ do not all have the same sign.
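
Legendre's criterion is a finite computation, so it can be tried on small examples. The sketch below is illustrative only: it assumes the inputs are already pairwise coprime, squarefree and non-zero (this is not checked), tests squares mod $m$ by brute force, and pairs the criterion with a naive bounded search whose bound is an arbitrary choice. All function names are ours.

```python
def is_square_mod(x, m):
    """Is x congruent to a square modulo |m|? (Brute force; fine for small m.)"""
    m = abs(m)
    return any((t * t - x) % m == 0 for t in range(m))

def legendre_solvable(a, b, c):
    """Legendre's criterion, assuming a, b, c pairwise coprime and squarefree."""
    same_sign = (a > 0 and b > 0 and c > 0) or (a < 0 and b < 0 and c < 0)
    return (not same_sign
            and is_square_mod(-a * b, c)
            and is_square_mod(-a * c, b)
            and is_square_mod(-b * c, a))

def search(a, b, c, bound):
    """Look for positive integers l, m, n <= bound with a l^2 + b m^2 + c n^2 = 0."""
    for l in range(1, bound + 1):
        for m in range(1, bound + 1):
            for n in range(1, bound + 1):
                if a * l * l + b * m * m + c * n * n == 0:
                    return l, m, n
    return None

for coeffs in [(1, 1, -1), (2, 3, -5), (1, 5, -3), (1, 1, -3)]:
    print(coeffs, legendre_solvable(*coeffs), search(*coeffs, 30))
```

When the criterion fails, the search must come back empty whatever the bound; when it holds, a solution exists but may lie beyond a small bound, so the search is only a spot check.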


This principle may be extended to the rational solutions of more or less any
quadratic equation: Any quadratic polynomial in $n$ variables can be diagonalized;
that is, a linear change of variables can change the polynomial into a diagonal
quadratic polynomial. We know that in the example $g = ax^2 + bxy + cy^2$ we can
let $X = x + by/2a$ and then $g = aX^2 + Dy^2$ where $D = -(b^2 - 4ac)/4a$. In a
three-variable example we take the polynomial

$$f = x^2 + 2xy + 3xz + 4y^2 + 5yz + 6z^2 + 7x + 8y + 9z + 10;$$

we let $X = x + y + \frac{3}{2}z + \frac{7}{2}$ replace $x$ to obtain $f = X^2 + 3y^2 + 2yz + \frac{15}{4}z^2 + y - \frac{3}{2}z - \frac{9}{4}$.
Then letting $Y = y + \frac{1}{3}z + \frac{1}{6}$ we obtain $f = X^2 + 3Y^2 + \frac{41}{12}z^2 - \frac{11}{6}z - \frac{7}{3}$, and if
$z = 6Z + \frac{11}{41}$ this becomes

$$F = X^2 + 3Y^2 + 123Z^2 - \frac{423}{164},$$

¹ The careful reader will note that we do not seem to have made adequate remarks about the
solution modulo powers of 2. However, we noted earlier in this section that if there are solutions in the
reals and modulo all but one prime, then there is a solution modulo all powers of this last prime. For
more details see appendix 9B.

a diagonal quadratic with no “cross terms” (like $XY$). Notice that the rational
solutions to $F(X,Y,Z) = 0$ are in 1-to-1 correspondence with the rational solutions
to $f(x,y,z) = 0$.

Whether or not a given diagonal quadratic with three or more terms has rational
solutions can then be resolved by the local-global principle.²
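
The diagonalization of the three-variable example can be verified with exact arithmetic. In the sketch below the coefficients of $f$ and the substitutions are the ones obtained by completing the square in $x$, then $y$, then $z$; the function names are ours.

```python
from fractions import Fraction as Fr

def f(x, y, z):
    return (x * x + 2 * x * y + 3 * x * z + 4 * y * y + 5 * y * z + 6 * z * z
            + 7 * x + 8 * y + 9 * z + 10)

def F(X, Y, Z):
    return X * X + 3 * Y * Y + 123 * Z * Z - Fr(423, 164)

def change_of_variables(x, y, z):
    """Complete the square in x, then in y, then in z."""
    X = x + y + Fr(3, 2) * z + Fr(7, 2)
    Y = y + Fr(1, 3) * z + Fr(1, 6)
    Z = (z - Fr(11, 41)) / 6          # equivalent to z = 6Z + 11/41
    return X, Y, Z

grid = range(-3, 4)
print(all(f(x, y, z) == F(*change_of_variables(x, y, z))
          for x in grid for y in grid for z in grid))
```

Agreement on a grid of this size is more than enough to determine a quadratic polynomial in three variables, so the check confirms the identity $f(x,y,z) = F(X,Y,Z)$.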

Exercise 9.3.1. Given one integer solution to $ax_0^2 + by_0^2 + cz_0^2 = 0$, show that all other integer
solutions to $ax^2 + by^2 + cz^2 = 0$ are given by the parametrization

$$x : y : z = (ar^2 - bs^2)x_0 + 2brs\,y_0 \ : \ 2ars\,x_0 - (ar^2 - bs^2)y_0 \ : \ (ar^2 + bs^2)z_0 .$$


9.4. Representation of integers by $ax^2 + by^2$ with $x, y$ rational, and
beyond


Coprime integer solutions to $au^2 + bv^2 = cw^2$ with $w > 0$ are in 1-to-1 correspondence
with the rational solutions to $ax^2 + by^2 = c$, by taking $x = u/w$ and $y = v/w$.
Therefore the local-global principle can be restated to give an “if and only if” criterion
to determine whether $c$ can be written as $ax^2 + by^2$ with $x$ and $y$ rational.
This is most usefully modified as follows:


Corollary 9.4.1. Suppose that $a, b, c$ are given integers with $(a,b,c) = 1$, and
suppose $d = b^2 - 4ac$ is not divisible by the square of any odd prime. For any
given squarefree integer $N$ with $(N,d) = 1$, there exist rationals $u$ and $v$ for which
$N = au^2 + buv + cv^2$ if and only if the following criteria hold:

• $N$ has the same sign as $a$ or $c$, or $d > 0$;

• $d$ is a square mod $N$;

• $\left(\frac{N}{p}\right) = \left(\frac{a}{p}\right)$ for all odd primes $p$ dividing $d$ that do not divide $a$;

• $\left(\frac{N}{p}\right) = \left(\frac{c}{p}\right)$ for all odd primes $p$ dividing both $d$ and $a$.

Proof. If $N = au^2 + buv + cv^2$, then we multiply through by $4a$ to obtain $4aN = (2au + bv)^2 - dv^2$;
in other words, $aN = U^2 - dV^2$ for some rationals $U, V$. We
may reverse this argument, and so there exist rationals $u$ and $v$ for which
$N = au^2 + buv + cv^2$ if and only if there exist rationals $U, V$ for which $aN = U^2 - dV^2$.
We now apply Legendre's version of the local-global principle to rational solutions
to the equation $aN = U^2 - dV^2$.

We have real solutions if and only if $aN > 0$ or $d > 0$.


Now $U^2 \equiv dV^2 \pmod{aN}$ and so $d$ must be a square mod $aN$. But
$d = b^2 - 4ac \equiv b^2 \pmod a$, so we need only verify that $d$ is a square mod $N$.


If odd prime $p$ divides $d$, then $aN \equiv U^2 \pmod p$, and so $\left(\frac{N}{p}\right) = \left(\frac{a}{p}\right)$ if $p$ does
not divide $a$.

If odd prime $p$ divides both $d$ and $a$, then it divides $b$, as it divides $b^2 = d + 4ac$.
Therefore $p$ does not divide $c$ as $(a,b,c) = 1$. We then run through the analogous
argument with $a$ replaced by $c$. (For the primes $p$ dividing $d$, but not $4ac$, our
results that $\left(\frac{N}{p}\right) = \left(\frac{a}{p}\right)$ and $\left(\frac{N}{p}\right) = \left(\frac{c}{p}\right)$ are consistent; see exercise 8.1.4.)


² Which we have only proved in three variables but is true in three or more variables.

9.5. The failure of the local-global principle for quadratic
equations in integers


We have seen how the local-global principle allows us to determine whether there
are rational solutions $x, y$ to a given equation of the form $ax^2 + by^2 = c$. However
we will now show that it does not help when we ask for integer solutions. The
example

$$x^2 + 23y^2 = 52$$

has rational solutions, like $\left(\frac{1}{2}, \frac{3}{2}\right)$, $\left(\frac{10}{3}, \frac{4}{3}\right)$, $\left(\frac{25}{4}, \frac{3}{4}\right)$. There are obviously no
integer solutions or else $23y^2 \le x^2 + 23y^2 = 52$ and so $y^2 = 0$ or 1, but then
$x^2 = 52 - 23y^2 = 52$ or 29, which are not squares. Since there are rational solutions
we know that there are non-trivial solutions to $a^2 + 23b^2 \equiv 52c^2 \pmod{p^k}$ for all
prime powers $p^k$ by the local-global principle, but not necessarily to $a^2 + 23b^2 \equiv 52 \pmod{p^k}$.
To prove that there are such solutions, we show that solutions exist
modulo 8 and all odd prime moduli $p$, and then we lift these solutions to all prime
power moduli $p^k$, using Proposition 8.8.1.


We have the solutions $2^2 + 23 \cdot 4^2 = 372 \equiv 52 \pmod 8$, $4^2 + 23 \cdot 1^2 = 39 \equiv 52 \pmod{13}$,
and $11^2 + 23 \cdot 0^2 = 121 \equiv 52 \pmod{23}$. For any odd prime $p$ other than
13 or 23, there are $\frac{p+1}{2}$ residues mod $p$ of the form $23y^2$, and $\frac{p+1}{2}$ residues mod $p$
of the form $52 - x^2$, so two of these residues must be equal. Therefore there is a
solution to $x^2 + 23y^2 \equiv 52 \pmod p$, and evidently one of $x$ and $y$ must be non-zero
(mod $p$) (or else $p$ would divide 52).
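
The contrast between rational, modular and integer solutions of $x^2 + 23y^2 = 52$ can be observed directly. This is a brute-force sketch over a handful of moduli chosen for illustration; the function names are ours, and the sample rational point $(1/2, 3/2)$ is one that is easily checked by hand.

```python
from fractions import Fraction

def integer_solutions():
    """All integer (x, y) with x^2 + 23 y^2 = 52; |x| <= 7 and |y| <= 1 suffice."""
    return [(x, y) for x in range(-8, 9) for y in range(-2, 3)
            if x * x + 23 * y * y == 52]

def solvable_mod(q):
    """Is x^2 + 23 y^2 ≡ 52 (mod q) solvable? (Exhaustive search.)"""
    return any((x * x + 23 * y * y - 52) % q == 0
               for x in range(q) for y in range(q))

x, y = Fraction(1, 2), Fraction(3, 2)
print(x * x + 23 * y * y == 52)                    # a rational solution
print(integer_solutions())                          # no integer solutions
moduli = [8, 16, 32, 9, 25, 169, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31]
print(all(solvable_mod(q) for q in moduli))         # solvable mod each of these
```

The list of moduli is only a sample; the argument in the text is what covers all prime powers.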


Therefore we have shown that the local-global principle holds for integer and
rational solutions of linear equations, and for rational but not integer solutions of
quadratic equations. However it does not even hold for rational solutions of cubic
equations: In 1957, Selmer showed that $3x^3 + 4y^3 = 5$ has solutions in the reals,
and mod $r$ for all $r \ge 1$, yet has no rational solutions. Further discussion of the
failure of the local-global principle for cubic equations can be found in [Grab], with
a motivating discussion in chapter 7.


9.6. Primes represented by $x^2 + 5y^2$


Calculations reveal that the primes $> 5$ that are represented by $x^2 + 5y^2$ are

29, 41, 61, 89, 101, 109, 149, 181, ....

From our explorations of the binary quadratic forms $x^2 + y^2$, $x^2 + 2y^2$, and $x^2 + 3y^2$
we might guess that this should be the set of primes for which $(-5/p) = 1$. However
the list of primes for which $(-5/p) = 1$ also includes the primes

3, 7, 23, 43, 47, 67, 83, 103, 107, 127, 163, 167, ....

What is going on? We quickly see that the primes in the first list end in a 1 or a
9, whereas the primes in the second list end in a 3 or a 7, so there seems to be a
further congruence condition that partitions the list. Further examination of the
equation $p = x^2 + 5y^2$ makes this evident: Besides $(-5/p) = 1$, we can also deduce
that $p \equiv x^2 \pmod 5$ so that $(p/5) = 1$. Combined with $(-5/p) = 1$, this also yields
that $p \equiv 1 \pmod 4$. These two conditions together give that $p \equiv 1$ or $9 \pmod{20}$,
the primes that we see in the first list, and if $(p/5) = -1$, then we obtain $p \equiv 3$ or
$7 \pmod{20}$, the primes that we see in the second list.


Where do the primes in the second list come from? It turns out there is a
second, fundamentally different binary quadratic form, $2x^2 + 2xy + 3y^2$, which has
the same discriminant $-20$ as $x^2 + 5y^2$. We first observe that these quadratic forms
definitely do not represent the same integers because $2x^2 + 2xy + 3y^2$ represents 3,
whereas $x^2 + 5y^2$ evidently does not. A quick calculation reveals that the second list
is precisely the set of odd primes represented by $2x^2 + 2xy + 3y^2$. This dichotomy
will be explored further in chapter 12, though we observe here that if prime
$p = 2x^2 + 2xy + 3y^2$, then $2p = 4x^2 + 4xy + 6y^2 = (2x + y)^2 + 5y^2$; that is, $2p$ can be
represented by $a^2 + 5b^2$.
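
The "quick calculation" can be repeated with a short search. In this sketch a form $ax^2 + bxy + cy^2$ is passed as the triple `(a, b, c)`; the helper names and the search bound are our own choices, with the bound taken generously for these two forms.

```python
from math import isqrt

def is_prime(n):
    return n >= 2 and all(n % q for q in range(2, isqrt(n) + 1))

def represented(form, p):
    """Is p = a x^2 + b x y + c y^2 for some integers x, y?"""
    a, b, c = form
    bound = isqrt(2 * p) + 1          # ample for the two forms used below
    # By the symmetry (x, y) -> (-x, -y) it is enough to take y >= 0.
    return any(a * x * x + b * x * y + c * y * y == p
               for x in range(-bound, bound + 1)
               for y in range(0, bound + 1))

print([p for p in range(7, 200) if is_prime(p) and represented((1, 0, 5), p)])
print([p for p in range(3, 170) if is_prime(p) and represented((2, 2, 3), p)])
```

The two printed lists should match the two lists of primes displayed earlier in this section.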

In general if we wish to represent the odd prime $p$ by $x^2 + dy^2$, then $-d$ must
be a square mod $p$. On the other hand, suppose that $-d$ is a square mod $p$, say
$u^2 \equiv -d \pmod p$ with $|u| < p/2$.

If $p < 2\sqrt{d}$, then we can write $u^2 + d = ap$, so the binary quadratic form
$pm^2 + 2umn + an^2$ has discriminant $-4d$, the same as $x^2 + dy^2$, and takes the value
$p$ when $m = 1$, $n = 0$.

Now assume that $p > 2\sqrt{d}$. By exercise 9.7.3(a) with $R = d^{1/4}\sqrt{p}$, $S = d^{-1/4}\sqrt{p}$,
there exist integers $r$ and $s$, not both 0, for which $r \equiv us \pmod p$ and
so, squaring, $r^2 \equiv -ds^2 \pmod p$; that is, $r^2 + ds^2$ is a multiple of $p$. Moreover we
have $0 < r^2 + ds^2 < R^2 + dS^2 = 2\sqrt{d}\,p$. Therefore there exists an integer $a$ in the
range $1 \le a < 2\sqrt{d}$ for which

$$r^2 + ds^2 = ap.$$

We may assume that $(r,s) = 1$ for if $g = (r,s)$, then we claim that $g^2$ divides
$a$, so we can divide $r$ and $s$ through by $g$. To justify our claim, note that $g^2$
divides $r^2 + ds^2 = ap$ so if $g^2$ does not divide $a$, then $p$ divides $g$. But then
$p^2 \le g^2 \le r^2 + ds^2 = ap$ and so $p \le a < 2\sqrt{d}$, a contradiction.

Now $(s,a) = 1$ or else if prime $q$ divides $a$ and $s$, then it divides $ap - ds^2 = r^2$,
and so it divides $r$, contradicting that $(r,s) = 1$. Let $b$ be an integer for which
$b \equiv r/s \pmod a$ so that $b^2 \equiv -d \pmod a$. We define integers $n = s$, $m = (r - bs)/a$,
and $c = (b^2 + d)/a$. This implies that $am + bn = r$ and so

$$am^2 + 2bmn + cn^2 = \frac{(am + bn)^2 - (b^2 - ac)n^2}{a} = \frac{r^2 + ds^2}{a} = p.$$

Therefore, whenever $-d$ is a square mod $p$, there is a quadratic equation in
two variables, with positive leading coefficient $< 2\sqrt{d}$, and of discriminant $-4d$,
which takes the value $p$. This is the first hint of a general theory: We will study
the solutions to quadratic equations in two variables, like this, in detail, in chapter
12.
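
The construction in this section is explicit enough to run. The following is a rough sketch under stated assumptions: $p$ is an odd prime not dividing $d$, square roots mod $p$ are found by brute force, and the pair $(r,s)$ is found by trying $s = 1, 2, \ldots$ rather than by the pigeonhole argument of exercise 9.7.3(a). The function name and the returned tuple layout are ours; `pow(s, -1, a)` needs Python 3.8 or later.

```python
from math import gcd

def form_representing(p, d):
    """Return (a, b, c, m, n) with a m^2 + 2 b m n + c n^2 = p and b^2 - a c = -d,
    or None if -d is not a square mod p. Assumes p is an odd prime, p does not divide d."""
    u = next((t for t in range(p) if (t * t + d) % p == 0), None)
    if u is None:
        return None
    if p * p < 4 * d:                      # the case p < 2 sqrt(d)
        return p, u, (u * u + d) // p, 1, 0
    for s in range(1, p):                  # the case p > 2 sqrt(d)
        r = (u * s) % p
        if r > p // 2:
            r -= p                         # centred residue of u*s mod p
        # r^2 + d s^2 < 2 sqrt(d) p, tested without floating point
        if (r * r + d * s * s) ** 2 < 4 * d * p * p:
            break
    else:
        raise ValueError("no suitable (r, s) found")
    g = gcd(r, s)
    r, s = r // g, s // g                  # now (r, s) = 1
    a = (r * r + d * s * s) // p
    b = 0 if a == 1 else (r * pow(s, -1, a)) % a   # b ≡ r/s (mod a)
    m, n = (r - b * s) // a, s
    c = (b * b + d) // a
    return a, b, c, m, n

for p in (3, 7, 23, 29, 41):
    print(p, form_representing(p, 5))
```

For $d = 5$ the output should involve only forms of discriminant $-20$, for example $2m^2 + 2mn + 3n^2$ at $(m,n) = (1,1)$ for $p = 7$.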


Additional exercises 


Exercise 9.7.1. Let $f(n)$ be the arithmetic function for which $f(n) = 1$ if $n$ can be written as
the sum of two squares, and $f(n) = 0$ otherwise. Prove that $f(n)$ is a multiplicative function.


Exercise 9.7.2. Let $p$ be a prime $\equiv 1 \pmod 4$. This exercise yields another proof that $p$ is the
sum of two squares.
(a) Use Theorem 8.3 to prove that there exist integers $a$ and $b$ such that $a^2 + b^2$ is a positive
multiple of $p$.
(b) Let $rp$ be the smallest such multiple of $p$. Prove that $r \le p/2$.
(c)† Prove that if $r > 1$, then there exists a positive integer $s \le r/2$ such that $rs = c^2 + d^2$ for
some integers $c$ and $d$, selected so that $ad - bc$ is divisible by $r$.
(d) Use to deduce that if $r > 1$, then $sp$ is a sum of two squares.

This contradicts the minimality of $r$ unless $r = 1$; that is, $p$ is the sum of two squares.


Exercise 9.7.3. Let $p$ be an odd prime.
(a)† Suppose that $b \pmod p$ is given and that $R, S > 1$ such that $RS = p$. Prove that there
exist integers $r, s$ with $|r| < R$, $0 < s < S$ such that $b \equiv r/s \pmod p$.
(b) Prove that there exists an integer $m$ with $|m| < \sqrt{p}$ for which $\left(\frac{m}{p}\right) = -1$.
(c) Deduce that if $p \equiv 1 \pmod 4$, then there exists an integer $n$ in the range $1 < n < \sqrt{p}$ for
which $\left(\frac{n}{p}\right) = -1$.


Exercise 9.7.4. Show that x and y are integers in (9.1.3) if and only if r? +s? divides 2(ar +s), 
and show that this can only happen if r? + s? divides 2n. 


Exercise 9.7.5. What values of $r$ and $s$ yield the point $(-a, -b)$ in Proposition 9.1.2?

Exercise 9.7.6. Reprove exercise 9.1.8 using Theorem 9.1 and (9.1.1).


Exercise 9.7.7.† $33^2 + 56^2 = 65^2$ and $16^2 + 63^2 = 65^2$ are examples of the side lengths of
different primitive Pythagorean triangles with the same hypotenuse. Classify those integers that
appear as the hypotenuse of at least two different primitive Pythagorean triangles.


Exercise 9.7.8. Prove that for every integer $m$ there exists an integer $n$ which is the length of
the hypotenuse of at least $m$ different primitive Pythagorean triples. (You may use Theorem 7.4,
which implies that there are infinitely many primes $\equiv 1 \pmod 4$.)


Exercise 9.7.9.† Prove that an integer of the form $a^2 + 4b^2$ with $(a, 2b) = 1$ cannot be divisible
by any integer of the form $m^2 - 2$ with $m > 1$, or $m^2 + 2$. Conversely prove that an integer of
the form $m^2 - 2n^2$ or $m^2 + 2n^2$ with $(m, 2n) = 1$ cannot be divisible by any integer of the form
$a^2 + 4$.


Exercise 9.7.10.† (Zagier's proof that every prime $\equiv 1 \pmod 4$ is the sum of two squares) Let
$S := \{ (x,y,z) \in \mathbb{N}^3 : p = x^2 + 4yz \}$.
Define the map $\phi : S \to S$ by

$$\phi : (x,y,z) \mapsto \begin{cases} (x + 2z,\ z,\ y - x - z) & \text{if } x < y - z, \\ (2y - x,\ y,\ x - y + z) & \text{if } y - z < x < 2y, \\ (x - 2y,\ x - y + z,\ y) & \text{if } x > 2y. \end{cases}$$

(a) Show that $\phi$ is an involution, that is, $\phi^2 = 1$, and verify that each $\phi(v)$, $v \in S$, belongs to $S$.
(b) Prove that if $\phi(v) = v$, then $v = (1, 1, \frac{p-1}{4})$.
(c) Deduce that there are an odd number of elements of $S$ (in particular, $S$ is non-empty).
Let $\psi : S \to S$ be the involution $\psi(x,y,z) = (x,z,y)$.
(d) Prove that $\psi$ has a fixed point $(x,y,z)$, so that $z = y$.
(e) Deduce that $p = x^2 + (2y)^2$ for some integers $x, y$.
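
Zagier's involution argument can be explored on small primes before attempting the exercise. The sketch below enumerates $S$ by brute force and checks the claimed properties with assertions; it illustrates the statements (a) to (e) and is not a substitute for proving them. All function names are ours.

```python
def solutions(p):
    """The set S of triples (x, y, z) of positive integers with p = x^2 + 4yz."""
    S = set()
    x = 1
    while x * x < p:
        rest = p - x * x
        if rest % 4 == 0:
            q = rest // 4
            for y in range(1, q + 1):
                if q % y == 0:
                    S.add((x, y, q // y))
        x += 1
    return S

def phi(v):
    x, y, z = v
    if x < y - z:
        return (x + 2 * z, z, y - x - z)
    if x < 2 * y:                       # here y - z < x < 2y
        return (2 * y - x, y, x - y + z)
    return (x - 2 * y, x - y + z, y)    # here x > 2y

def two_squares_via_zagier(p):
    """For a prime p ≡ 1 (mod 4), return (x, 2y) with x^2 + (2y)^2 = p."""
    S = solutions(p)
    assert all(phi(v) in S and phi(phi(v)) == v for v in S)       # (a)
    assert [v for v in S if phi(v) == v] == [(1, 1, (p - 1) // 4)]  # (b)
    x, y, z = next(v for v in S if v[1] == v[2])                  # (d)
    return x, 2 * y                                               # (e)

for p in (5, 13, 17, 29, 37, 41):
    print(p, len(solutions(p)), two_squares_via_zagier(p))
```

The size of $S$ printed in the second column should always be odd, as part (c) predicts.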


Appendix 9A. Proof of 
the local-global principle 
for quadratic equations 


In this appendix we will give the difficult part of the proof of the local-global 
principle for quadratic equations, Theorem as discussed at length in section 


The local-global principle for quadratic equations. Let $a, b, c$ be given
pairwise coprime, squarefree integers. There are solutions in

    non-zero integers $\ell, m, n$ to $a\ell^2 + bm^2 + cn^2 = 0$ with $(a\ell^2, bm^2) = 1$

if and only if there are solutions in

    non-zero real numbers $\lambda, \mu, \nu$ to $a\lambda^2 + b\mu^2 + c\nu^2 = 0$,

and, for all positive integers $r$, there exist

    residue classes $u, v, w \pmod r$ for which $au^2 + bv^2 + cw^2 \equiv 0 \pmod r$,

with $(au^2, bv^2, cw^2, r) = 1$.


Our proof depends on an understanding of lattices. 


9.8. Lattices and quotients 


A lattice $\Lambda$ in $\mathbb{R}^n$ is the set of points obtained by integer linear combinations of $n$
given linearly independent vectors. If the basis is $x_1, x_2, \ldots, x_n \in \mathbb{R}^n$, then

$$\Lambda = \{ m_1 x_1 + m_2 x_2 + \cdots + m_n x_n : m_1, m_2, \ldots, m_n \in \mathbb{Z} \}.$$

One can see that $\Lambda$ is an additive group, but it also has some geometry connected
to it. The fundamental domain of $\Lambda$ with respect to $x_1, x_2, \ldots, x_n$ is the set

$$P = P(\Lambda) := \{ a_1 x_1 + a_2 x_2 + \cdots + a_n x_n : 0 \le a_i < 1 \},$$

the interior (and part of the boundary) of one of the diamond-shaped cells in Figure
9.1. If $\lambda \in \Lambda$, then $\lambda + P$ gives us another of the diamond shapes, shifted from the
original by $\lambda$. Therefore the sets $\lambda + P$, $\lambda \in \Lambda$ are disjoint and their union is $\mathbb{R}^n$.
Therefore $P(\Lambda)$ is a set of representatives of

$$\mathbb{R}^n / \Lambda,$$

which is often called “$\mathbb{R}^n$ mod $\Lambda$”.


Figure 9.1. Constructing a lattice in $\mathbb{R}^2$, generated by vectors $u$ and $v$. The
shaded grey parallelogram is the fundamental domain $P(\Lambda)$. The dots represent
the same point in $\mathbb{R}^2/\Lambda$ repeated in each copy of $P(\Lambda)$; that is, they are
the points $P + \lambda$ for each vector $\lambda \in \Lambda$.


In the non-trivial example with n = 1, for which A = Z, we can write every 
real number z as m+ a where m € Z and a € [0,1), letting m = [z] and a = {z}. 
We prefer to think of this as z = a in the ring R/Z since their difference, m, is an 
integer. This generalizes to n dimensions, in which case we can identify R"/A with 
(R/Z)”. 

The determinant $\det(\Lambda)$ of $\Lambda$ is the volume of $P$; in fact $\det(\Lambda) = |\det(A)|$,
where $A$ is the matrix with column vectors $x_1, x_2, \ldots, x_n$ (written as vectors in
$\mathbb{R}^n$). A convex body $K$ is a bounded convex open subset[3] of $\mathbb{R}^n$.


3 These are all common terms in geometry. A set $S \subset \mathbb{R}^n$ is bounded if it can be contained inside
a ball of some finite radius. The set $S$ is convex if all the points on the straight line between any two
points of $S$ also belong to $S$. The set $S$ is open if there is a ball around any given point of $S$, perhaps
of very small radius, that also is contained within $S$.

If $\Lambda \subset \mathbb{Z}^n$, then there are $\det(\Lambda)$ cosets of $\Lambda$ in $\mathbb{Z}^n$; that is,
$|\mathbb{Z}^n/\Lambda| = \det(\Lambda).$
In the proof of Theorem 9.1 we work with the lattice
$\Lambda := \{(r,s) \in \mathbb{Z}^2 : r - ks \equiv 0 \pmod p\}$


(where $k^2 \equiv -1 \pmod p$). This lattice is presented there somewhat differently
from the definition here, but it can easily be seen that $\Lambda$ is generated by $(k,1)$ and
$(p,0)$, and that $(0,p) = p(k,1) - k(p,0)$. Hence $\det(\Lambda) = p$; in particular we deduce
that there are $p$ distinct cosets of $\Lambda$ within $\mathbb{Z}^2$.

Let $S$ be the set constructed in the proof of Theorem [9.]. $S$ is a convex set of
$> p$ elements of $\mathbb{Z}^2$ so that the difference, $d$, of two of them lies on the lattice $\Lambda$.
The set $S$ was constructed so that the difference, $d$, must lie close to the origin.
Moreover $\Lambda$ was constructed so that if $(r,s) \in \Lambda$, then $r^2 + s^2 \equiv 0 \pmod p$ (since
if $r \equiv ks \pmod p$, then $r^2 + s^2 \equiv (ks)^2 + s^2 = (k^2+1)s^2 \equiv 0 \pmod p$).
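
The pigeonhole argument just summarized can be tried out numerically. The following is a minimal sketch, assuming that the set $S$ is the box of pairs $(i,j)$ with $0 \le i, j \le [\sqrt{p}]$ (the construction itself is not repeated in this section), and finding $k$ by brute force; the function name two_squares is ours, chosen only for illustration.

```python
from math import isqrt

def two_squares(p):
    """Write a prime p = 1 (mod 4) as r^2 + s^2 via the pigeonhole argument."""
    # Find k with k^2 = -1 (mod p) by brute force (fine for small p).
    k = next(t for t in range(1, p) if (t * t + 1) % p == 0)
    m = isqrt(p)
    seen = {}  # residue of i - k*j mod p  ->  first pair (i, j) giving it
    for i in range(m + 1):
        for j in range(m + 1):
            key = (i - k * j) % p
            if key in seen:
                I, J = seen[key]
                # The difference (r, s) lies on the lattice: r - k*s = 0 (mod p),
                # and it is close to the origin since |r|, |s| <= [sqrt(p)].
                r, s = i - I, j - J
                return abs(r), abs(s)
            seen[key] = (i, j)
    raise ValueError("no collision found; is p a prime = 1 (mod 4)?")
```

Since the box contains more than $p$ pairs, two of them must share a residue, and the resulting $r^2 + s^2$ is a positive multiple of $p$ that is smaller than $2p$.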

We will now develop these ideas to give a proof of the local-global principle. In 
the next section we will modify the last step to make it more elegant. 


Proof of the local-global principle. Assume that $a$, $b$, and $c$ are squarefree,
pairwise coprime integers, with $a, b > 0 > c$ (so that there are non-zero real solutions
to $ax^2 + by^2 + cz^2 = 0$), and that there exists a solution to

$au^2 + bv^2 + cw^2 \equiv 0 \pmod{|abc|},$


with $(au^2, bv^2, cw^2, abc) = 1$.[4] We may assume that at least two of $a, b, |c|$ are $> 1$,
for the case $a = b = 1$ can be proved directly from Theorem 9.2 while the case
$a = 1$, $c = -1$ is easy as we always have the solution $x = b - 1$, $y = 2$, $z = b + 1$.
Define the lattice
$\Lambda := \{(x,y,z) \in \mathbb{Z}^3 : aux + bvy + cwz \equiv 0 \pmod{|abc|}\}.$
We claim that if $(x,y,z) \in \Lambda$, then
$ax^2 + by^2 + cz^2 \equiv 0 \pmod{|abc|}.$
We now prove that this holds mod $a$ (and the cases mod $b$ and mod $|c|$ proceed
analogously, so that the claim follows using the Chinese Remainder Theorem). Now
if $(x,y,z) \in \Lambda$, then $bvy \equiv -cwz \pmod a$, and so
$bv^2 \cdot by^2 = (bvy)^2 \equiv (-cwz)^2 = cw^2 \cdot cz^2 \pmod a.$
Dividing through by $bv^2 \equiv -cw^2 \pmod a$, we deduce that $by^2 \equiv -cz^2 \pmod a$.
Therefore $ax^2 + by^2 + cz^2 \equiv 0 \pmod a$, as desired.
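
The claim can be checked by machine in a small case. The sketch below is only an illustration: the coefficients $a = 2$, $b = 3$, $c = -5$ with local solution $u = v = w = 1$ are our own choice of example, and the helper names are hypothetical. It lists the lattice points in a small box and tests the congruence on each.

```python
from itertools import product

def lattice_points(a, b, c, u, v, w, box=6):
    """Points (x, y, z) with |x|, |y|, |z| <= box and aux + bvy + cwz = 0 (mod |abc|)."""
    n = abs(a * b * c)
    rng = range(-box, box + 1)
    return [(x, y, z) for x, y, z in product(rng, repeat=3)
            if (a * u * x + b * v * y + c * w * z) % n == 0]

def claim_holds(a, b, c, u, v, w, box=6):
    """True if ax^2 + by^2 + cz^2 = 0 (mod |abc|) at every lattice point in the box."""
    n = abs(a * b * c)
    return all((a * x * x + b * y * y + c * z * z) % n == 0
               for x, y, z in lattice_points(a, b, c, u, v, w, box))
```

A finite check like this proves nothing, of course, but it is a quick way to catch a sign error when adapting the argument.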


In the next exercise we will show that $|\det(\Lambda)| = |abc|$. Let


$S := \{(i,j,k) : 0 \le i \le [\sqrt{|bc|}],\ 0 \le j \le [\sqrt{|ac|}],\ 0 \le k \le [\sqrt{|ab|}]\}.$


4 Lemma 9.3.1 implies that we should work modulo $8|abc|$ in proving the local-global principle.
However, in this first version of our proof, we prefer to not worry about the equation modulo powers of
2. We will revisit this issue in the next section.

The number of integer points in $S$ is $> \sqrt{|bc|} \cdot \sqrt{|ac|} \cdot \sqrt{|ab|} = |abc| = |\mathbb{Z}^3/\Lambda|$, and
so, by the pigeonhole principle, there must be two lattice points in $S$ that differ by a
non-zero element $(x,y,z) \in \Lambda$. If the two lattice points are $(i,j,k)$ and $(I,J,K)$,
then


$|x| = |i - I| \le [\sqrt{|bc|}], \quad |y| = |j - J| \le [\sqrt{|ac|}], \quad |z| = |k - K| \le [\sqrt{|ab|}].$


These are all “<” as none of $|bc|$, $|ac|$, $|ab|$ are squares, since at least two of $a, b, |c|$ are
$> 1$ and they are pairwise coprime. Therefore $ax^2 + by^2 < 2|abc|$ and $|cz^2| < |abc|$,
so that

$-|abc| < ax^2 + by^2 + cz^2 < 2|abc|.$
This implies that either $ax^2 + by^2 + cz^2 = 0$ or $ax^2 + by^2 + cz^2 = |abc| = -abc$.
We need to eliminate the second case. I know of two ways to do this. The first is
inelegant and comes from simply noting that if $ax^2 + by^2 + cz^2 + abc = 0$, then


$a(xz - by)^2 + b(ax + yz)^2 + c(ab + z^2)^2 = (ab + z^2)(ax^2 + by^2 + cz^2 + abc) = 0.$


The second involves slightly modifying the definition of $\Lambda$, by taking the prime 2
into account more carefully, which we discuss in the next section.
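
The identity used in the first method is easy to mistype, so it is worth testing numerically. In this sketch the function names lhs, rhs, and lift are ours; lift simply reads off the new solution $(xz - by,\ ax + yz,\ ab + z^2)$ from a triple satisfying $ax^2 + by^2 + cz^2 = -abc$.

```python
def lhs(a, b, c, x, y, z):
    """Left side of the identity."""
    return a * (x * z - b * y) ** 2 + b * (a * x + y * z) ** 2 + c * (a * b + z * z) ** 2

def rhs(a, b, c, x, y, z):
    """Right side of the identity."""
    return (a * b + z * z) * (a * x * x + b * y * y + c * z * z + a * b * c)

def lift(a, b, c, x, y, z):
    """Turn a solution of ax^2 + by^2 + cz^2 = -abc into one of aX^2 + bY^2 + cZ^2 = 0."""
    return x * z - b * y, a * x + y * z, a * b + z * z
```

For instance, with $a = 2$, $b = 3$, $c = -5$ the triple $(3, 2, 0)$ satisfies $2 \cdot 9 + 3 \cdot 4 = 30 = -abc$, and lift returns $(-6, 6, 6)$, which satisfies $2 \cdot 36 + 3 \cdot 36 - 5 \cdot 36 = 0$.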


Exercise 9.8.1. (a) Show that there exist integers $U, V, W$, coprime with $abc$, for which $U \equiv u$
(mod $bc$), $V \equiv v$ (mod $ac$), $W \equiv w$ (mod $ab$), so that $aU^2 + bV^2 + cW^2 \equiv 0 \pmod{|abc|}$.
(b) Let $U^{-1}$ be an integer $\equiv 1/U$ (mod $abc$) and $W^{-1}$ be an integer $\equiv 1/W$ (mod $abc$). Show
that $\Lambda$ is generated by the vectors $(1, VU^{-1}, WU^{-1})$, $(0, c, -bVW^{-1})$, and $(0, 0, ab)$.
(c) Deduce that $\det(\Lambda) = |abc|$.


9.9. A better proof of the local-global principle


The idea is to construct a lattice, based on that in the previous section, but now
of determinant $4|abc|$. We begin by defining
$\Lambda_0 := \{(x,y,z) \in \mathbb{Z}^3 : aux + bvy + cwz \equiv 0 \pmod{|abc|}\}.$
If $c$ is even, then let
$\Lambda := \{(x,y,z) \in \Lambda_0 : y \equiv x \pmod 4 \text{ and } z \equiv wx \pmod 2\}$
based on the given solution $(u,v,w)$. We construct $\Lambda$ analogously if $a$ or $b$ is even.


If $abc$ is odd, then one of $u, v, w$ must be even (as $au^2 + bv^2 + cw^2 \equiv 0$), say $w$.
If so, then let


$\Lambda := \{(x,y,z) \in \Lambda_0 : y \equiv x \pmod 2 \text{ and } z \equiv 0 \pmod 2\},$
using the given solution $(u,v,w)$. We construct $\Lambda$ analogously if $u$ or $v$ is even.


Exercise 9.9.1. (a) Prove that if $(x,y,z) \in \Lambda$, then $ax^2 + by^2 + cz^2 \equiv 0 \pmod{4|abc|}$.
(b) Prove that $\det(\Lambda) = 4|abc|$.


Consider the set of integer points


$S := \{(i,j,k) : 0 \le i \le [\sqrt{2|bc|}],\ 0 \le j \le [\sqrt{2|ac|}],\ 0 \le k \le [2\sqrt{|ab|}]\}.$


The number of lattice points in $S$ is $> \sqrt{2|bc|} \cdot \sqrt{2|ac|} \cdot 2\sqrt{|ab|} = 4|abc| = |\mathbb{Z}^3/\Lambda|$
by exercise 9.9.1(b), and so, by the pigeonhole principle, there must be two lattice
points in $S$ that differ by a non-zero element $(x,y,z) \in \Lambda$. If the two lattice points
are $(i,j,k)$ and $(I,J,K)$, then


$|x| = |i - I| \le [\sqrt{2|bc|}], \quad |y| = |j - J| \le [\sqrt{2|ac|}], \quad |z| = |k - K| \le [2\sqrt{|ab|}].$

Therefore $ax^2 + by^2 < 4|abc|$ and $|cz^2| < 4|abc|$ (as equality would only be possible
if $a = b = 1$), and so
$|ax^2 + by^2 + cz^2| < 4|abc|.$
Now, since $(x,y,z) \in \Lambda$, we know that
$ax^2 + by^2 + cz^2 \equiv 0 \pmod{4|abc|},$


by exercise 9.9.1(a), and so we must have $ax^2 + by^2 + cz^2 = 0$ as desired.


A by-product of this proof is that the smallest non-trivial solution satisfies
$|a\ell^2|, |bm^2|, |cn^2| \le 4|abc|.$
In 1950, Holzer showed that one may replace $4|abc|$ by $|abc|$.
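
Such a bound turns the search for a solution into a finite computation. The sketch below (with the hypothetical name small_solution, and practical only for small coefficients) looks for a non-trivial solution with $ax^2, by^2, |c|z^2 \le 4|abc|$ by brute force. It assumes, as in the proof, that $a, b > 0 > c$ are squarefree and pairwise coprime.

```python
from math import isqrt
from itertools import product

def small_solution(a, b, c):
    """Search the box a*x^2, b*y^2, |c|*z^2 <= 4|abc| for a non-trivial
    solution of a*x^2 + b*y^2 + c*z^2 = 0 with x, y, z >= 0."""
    n = abs(a * b * c)
    bx = isqrt(4 * n // a)        # largest x with a*x^2 <= 4|abc|
    by = isqrt(4 * n // b)
    bz = isqrt(4 * n // abs(c))
    for x, y, z in product(range(bx + 1), range(by + 1), range(bz + 1)):
        if (x, y, z) != (0, 0, 0) and a * x * x + b * y * y + c * z * z == 0:
            return x, y, z
    return None  # nothing in the box
```

Under the stated assumptions, a return value of None indicates that the equation has no non-trivial integer solutions at all, as happens for $x^2 + y^2 = 3z^2$.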


Exercise 9.9.2. Give infinitely many examples in which $\max\{|a\ell^2|, |bm^2|, |cn^2|\} = |abc|$ in the
smallest non-trivial solution of $a\ell^2 + bm^2 + cn^2 = 0$.


Appendices. The extended version of chapter 9 has the following additional 
appendices: 

Appendix 9B. Reformulation of the local-global principle. We introduce the 
Hilbert symbol and go on to formulate and prove the Hasse-Minkowski principle, 
the local-global principle for quadratics in n variables with n > 3. 

Appendix 9C. The number of representations studies how often an integer is 
the sum of two squares and uses this to introduce some important formulas. 

Appendix 9D. Descent and the quadratics introduces several famous questions 
which require descent and can be analyzed through matrix actions and orbits, 
including the beautiful question of tiling a circle with smaller circles. 


Chapter 10 


Square roots and factoring 


In this chapter we will study the computational side of number theory, which plays 
an important role in several uses of computers in today’s society, particularly when 
it comes to keeping secrets. We will investigate how to rapidly determine whether 
a given large integer is prime and, if not, how to factor it. The issue of factoring 
an integer n is closely related to determining square roots mod n: 


10.1. Square roots modulo n 


How difficult is it to find square roots mod n? The first question to ask is how 
many square roots does a square have mod n? 


Lemma 10.1.1. If $n$ is an odd integer with $k$ distinct prime factors and $A$ is a square mod
$n$ with $(A,n) = 1$, then there are exactly $2^k$ residues mod $n$ whose square is $\equiv A$
(mod $n$).


In particular, all squares mod $m$, that are coprime to $m$, have the same number of
square roots mod $m$. We resolved how many square roots 1 (mod $n$) has in Lemma
3.8.1 and here we modify that proof to better suit the discussion in this chapter.
We could have immediately deduced Lemma 10.1.1 for if $A$ is a square mod $n$, then
there exists $b$ (mod $n$) such that $b^2 \equiv A \pmod n$, and then the solutions to $x^2 \equiv A$
(mod $n$) are in 1-to-1 correspondence with the solutions to $y^2 \equiv 1 \pmod n$ through
the invertible transformation $x \equiv by \pmod n$.


Proof. Suppose that $b^2 \equiv A \pmod n$ where $n = p_1^{e_1} p_2^{e_2} \cdots p_k^{e_k}$, and each $p_i$ is odd
and distinct. If $x^2 \equiv A \pmod n$, then $n \mid (x^2 - b^2) = (x - b)(x + b)$ so that $p$ divides
$x - b$ or $x + b$ for each prime $p$ dividing $n$. Now $p$ cannot divide both or else $p$
divides $(x + b) - (x - b) = 2b$ and so $4A \equiv (2b)^2 \equiv 0 \pmod p$, which contradicts
the fact that $(p, 2A) \mid (n, 2A) = 1$. So let


$d = (n, x - b)$, and therefore $n/d = (n, x + b)$,

which must be coprime. Then $x \equiv b_d \pmod n$ where $b_d$ is that unique residue class
mod $n$ for which


(10.1.1) $b_d \equiv \begin{cases} b \pmod d, \\ -b \pmod{n/d}. \end{cases}$

Note that the $b_d$ are well-defined by the Chinese Remainder Theorem, are distinct,
and that $x^2 \equiv b_d^2 \equiv b^2 \equiv A \pmod n$ for each $d$.
The possible values of $d$ are $\prod_{i \in I} p_i^{e_i}$ for each subset $I$ of $\{1, \ldots, k\}$, and
therefore there are $2^k$ possibilities.


To see how the proof works let’s obtain the four square roots of 4 (mod 15) from
knowing one square root, 2, and the factorization of 15. These four square roots
are given by four pairs of congruences which we solve using the Chinese Remainder
Theorem:


$2 \pmod 1$ and $-2 \pmod{15}$ which yield $13 \pmod{15}$;
$2 \pmod 3$ and $-2 \pmod 5$ which yield $8 \pmod{15}$;
$2 \pmod 5$ and $-2 \pmod 3$ which yield $7 \pmod{15}$; and
$2 \pmod{15}$ and $-2 \pmod 1$ which yield $2 \pmod{15}$.
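
The same computation can be automated. The following sketch, assuming Python 3.8 or later (for modular inverses via pow), takes one square root $b$ and the coprime prime-power factors of $n$, and combines every choice of signs with the Chinese Remainder Theorem. The names crt_pair and all_square_roots are ours.

```python
from itertools import product

def crt_pair(r1, m1, r2, m2):
    """The residue x mod m1*m2 with x = r1 (mod m1), x = r2 (mod m2); m1, m2 coprime."""
    inv = pow(m1, -1, m2)  # inverse of m1 modulo m2
    return (r1 + m1 * ((r2 - r1) * inv % m2)) % (m1 * m2)

def all_square_roots(b, prime_powers):
    """All x mod n with x^2 = b^2 (mod n), where n is the product of the given
    pairwise coprime odd prime powers and b is coprime to n."""
    roots = set()
    for signs in product((1, -1), repeat=len(prime_powers)):
        x, m = 0, 1
        for s, q in zip(signs, prime_powers):
            x = crt_pair(x, m, (s * b) % q, q)  # impose x = +b or -b mod q
            m *= q
        roots.add(x)
    return sorted(roots)
```

Calling all_square_roots(2, [3, 5]) returns [2, 7, 8, 13], the four square roots of 4 (mod 15).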


Consequence. Let $n$ be an odd integer with at least two different prime factors,
and suppose that $b^2 \equiv A \pmod n$ with $(A,n) = 1$. Finding square roots of $A$ mod
$n$, other than $b$ and $-b$, is “as difficult as” factoring $n$ into two parts both $> 1$.


Sketch of “proof”. If we have a factorization $n = d \cdot n/d$, then we select $b_d$ as in
(10.1.1) so that $b_d^2 \equiv A \pmod n$ but $b_d \not\equiv \pm b \pmod n$, as $d, n/d > 1$.


In the other direction, suppose that one has a fast algorithm for rapidly finding
arbitrary square roots mod $n$ for odd integers $n$. In particular given $A$ (mod $n$),
the algorithm randomly determines some $x$ (mod $n$) for which $x^2 \equiv A \pmod n$;
by “random” we mean that each time the “square root finding” algorithm is run it
is equally likely to produce any one of the $2^k$ solutions (as in Lemma 10.1.1). Now
define $d = (n, x - b)$ (as in the proof of Lemma 10.1.1) and so we factor $n$ as $d \cdot n/d$.
This works provided $d \ne 1$ or $n$, that is, provided that $x \not\equiv b$ or $-b \pmod n$.

Now, the probability that $x \equiv b$ or $-b \pmod n$ is $2/2^k$ which is $\le \frac12$ as $k \ge 2$.
Therefore the probability of finding a non-trivial factor of $n$ each time the “square
root finding” algorithm is run is $\ge \frac12$. This does not seem persuasive, but if we
run the “square root finding” algorithm 20 times, then the probability that the
algorithm gives 1 or $n$ on every run is $\le (\frac12)^{20}$, which is less than one in a million.
So, in practice, we will quickly find a non-trivial factor of $n$.


We have shown that finding square roots mod n and factoring n are more or 
less equally difficult problems. 
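
The reduction from factoring to square roots can be simulated on a small modulus. In the sketch below the “square root finding” algorithm is a stand-in of our own (brute_force_oracle simply tries every residue, so it is useful only for tiny $n$); factor_with_oracle then follows the argument above, choosing a fresh random $b$ on each run.

```python
import random
from math import gcd

def brute_force_oracle(n):
    """Stand-in for the hypothetical fast square-root finder (tiny n only)."""
    def oracle(A):
        roots = [x for x in range(n) if (x * x - A) % n == 0]
        return random.choice(roots)  # each square root is equally likely
    return oracle

def factor_with_oracle(n, oracle, runs=20):
    """Try to split odd n (with at least two prime factors) using the oracle."""
    for _ in range(runs):
        b = random.randrange(1, n)
        g = gcd(b, n)
        if g > 1:                 # b already shares a factor with n
            return g, n // g
        x = oracle(b * b % n)     # some square root of b^2, not necessarily +-b
        d = gcd(n, x - b)
        if 1 < d < n:
            return d, n // d
    return None                   # every run returned +-b
```

With $n = 77$ a handful of runs almost always suffices, in line with the estimate that each run succeeds with probability at least one half.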


Exercise 10.1.1. Find all of the square roots of 49 mod $3^2 \cdot 5 \cdot 11$.


10.2. Cryptosystems


Cryptography has been around for as long as the need to communicate secrets at a 
distance. Julius Caesar, on campaign, communicated military messages by creating 
ciphertext from plaintext (the unencrypted message), replacing each letter of the
plaintext with that letter which is three letters further on in the alphabet. Thus A 
becomes D, B becomes E, etc. For example, 


thisisveryinteresting 
becomes 
wklvlvyhublqwhuhvwlqj


(Y became B, since we wrap around to the beginning of the alphabet. It is
essentially the map $x \mapsto x + 3 \pmod{26}$.) At first sight an enemy might regard
WKLV...WLQJ as gibberish even if the message was intercepted. It is easy
enough to decrypt the ciphertext, simply by going back three places in the
alphabet for each letter, to reconstruct the original message. The enemy could easily
do this if they guessed that the key is to rotate the letters by three places in the
alphabet, or even if they only guessed that one rotates by a fixed number of
letters, as there would only be 25 possibilities to try. So in classical cryptography it
is essential to keep the key secret, as well as the technique by which the key was
created.[1]
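
The rotation is short enough to write out in code. This is a minimal sketch that assumes the message consists only of the lowercase letters a to z; the function names caesar and all_rotations are ours.

```python
def caesar(text, shift=3):
    """Rotate each lowercase letter by `shift` places, wrapping around mod 26."""
    return "".join(chr((ord(ch) - ord("a") + shift) % 26 + ord("a")) for ch in text)

def all_rotations(ciphertext):
    """The 25 candidate plaintexts an eavesdropper would have to inspect."""
    return {s: caesar(ciphertext, -s) for s in range(1, 26)}
```

Here caesar("thisisveryinteresting") gives "wklvlvyhublqwhuhvwlqj", and applying caesar with shift $-3$ to the result recovers the message. The dictionary returned by all_rotations shows why a fixed rotation offers so little protection: one of its 25 entries is the plaintext.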


One can generalize to arbitrary substitution ciphers where one replaces the
alphabet by some permutation of the alphabet. There are 26! permutations of our
alphabet, which is around $4 \times 10^{26}$ possibilities, enough one might think to be safe.
And it would be if the enemy went through each possibility, one at a time. However
the clever cryptographer will look for patterns in the ciphertext. In the above short
ciphertext we see that L appears four times among the 21 letters, and H, V, W three
times each, so it is likely that these letters each represent one of A, E, I, S, T. By
looking for multiword combinations (like the ciphertext for THE) one can quickly
break any ciphertext of around one hundred letters.
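
Counting letters is the first step of such an analysis. A minimal sketch, using the standard library's Counter (the function name letter_counts is ours):

```python
from collections import Counter

def letter_counts(ciphertext):
    """Tally how often each letter occurs, reported in upper case."""
    return Counter(ciphertext.upper())
```

Applied to the 21-letter ciphertext wklvlvyhublqwhuhvwlqj, the tally has L four times and H, V, W three times each; most_common() lists the letters in decreasing order of frequency.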


To combat this, armies in the First World War used longer cryptographic keys,
rather than of length 1. That is, they would take a word like ABILITY and since
A is letter 1 in the alphabet, B is letter 2, and ILITY are letters 9, 12, 9, 20, 25,
respectively, they would rotate on through the alphabet by 1, 2, 9, 12, 9, $-6$, $-1$ letters
to encrypt the first seven letters, and then repeat this process on the next seven.
For example, we begin with the message, adding the word “ability” as often as is
needed:


weneedtomakeanexample 
plus 
abilityabilityability 
becomes 
xgwqnxspojwnumfzjyyfd 
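
In code, the only change from a single rotation is that the shift now depends on the position in the message. The sketch below assumes lowercase letters a to z and the convention above that a rotates by 1, b by 2, and so on; the function names are ours.

```python
def keyword_encrypt(plaintext, keyword):
    """Rotate the i-th letter by the alphabet position of the i-th key letter."""
    out = []
    for i, ch in enumerate(plaintext):
        shift = ord(keyword[i % len(keyword)]) - ord("a") + 1  # a -> 1, ..., z -> 26
        out.append(chr((ord(ch) - ord("a") + shift) % 26 + ord("a")))
    return "".join(out)

def keyword_decrypt(ciphertext, keyword):
    """Undo keyword_encrypt by rotating back by the same amounts."""
    out = []
    for i, ch in enumerate(ciphertext):
        shift = ord(keyword[i % len(keyword)]) - ord("a") + 1
        out.append(chr((ord(ch) - ord("a") - shift) % 26 + ord("a")))
    return "".join(out)
```

With the key “ability”, keyword_encrypt("weneedtomakeanexample", "ability") returns "xgwqnxspojwnumfzjyyfd". Rotating by 20 is the same as rotating by $-6$, and by 25 the same as by $-1$, since everything is taken mod 26.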


This can again be “broken” by statistical analysis, though the longer the key length,
the harder it is to do. Of course using a long key on a battlefield would be difficult,
so one needed to compromise between security and practicality. A one-time pad,
where one uses such a long key that one never repeats a pattern, is unbreakable by
statistical analysis. This might have been used by spies during the cold war and
was perhaps based on the letters in an easily obtained book, so that the spy would
not have to possess any obviously incriminating evidence.


1 Steganography, hiding secrets in plain view, is another method for communicating secrets at a
distance. In 499 B.C., Histiaeus shaved the head of his most trusted slave, tattooed a message on his
bald head, and then sent the slave to Aristagoras, once the slave’s hair had grown back. Aristagoras then
shaved the slave’s head again to recover the secret message telling him to revolt against the Persians. In
more recent times, cold war spies reportedly used “microdots” to transmit information, and Al-Qaeda
supposedly notified its terrorist cells via messages hidden in images on certain webpages.


During the Second World War the Germans came up with an extraordinary
substitution cypher that involved changing several settings on a specially built
typewriter (an Enigma machine). The number of possibilities was so large that the
Germans remained confident that it could not be broken, and they even changed
the settings every day so as to ensure that it would be extremely difficult. The Poles
managed to obtain an early Enigma machine and their mathematicians determined
how it worked. They shared their findings with the Allies so that after a great
amount of effort the Allies were able to break German codes quickly enough to
be useful, even vital, to their planning and strategy.[2] Early successes led to the
Germans becoming more cautious, and thence to horrific decisions having to be
made by the Allied leaders to safeguard this most precious secret.[3]


The Allied cryptographers would cut down the number of possibilities (for 
the settings on the Enigma machine) to a few million, and then their challenge 
became to build a machine to try out many possibilities very rapidly. Up until then 
one would have to change, by hand, external settings on the machine to try each 
possibility; it became a goal to create a machine in which one could change what 
it was doing, internally, by what became known as a program, and this stimulated, 
in part, the creation of the first modern computers. 


Exercise 10.2.1. One can also create a cryptosystem using binary addition. For example, our
key could be the 20-letter word $k = 10111011101111011001$. Then we could encrypt by using
bit-by-bit addition; that is, $0 \oplus 0 = 1 \oplus 1 = 0$ and $0 \oplus 1 = 1 \oplus 0 = 1$. Therefore if the plaintext
is $p = 11100010101101000011$, then $c = p \oplus k$, namely
           10111 01110 11110 11001
  $\oplus$  11100 01010 11010 00011
  =        01011 00100 00100 11010.


It is easy to recover the plaintext since $p = c \oplus k$. Prove that one can recover the key if one knows
the ciphertext and the plaintext.


10.3. RSA 


In the theory of cryptography we always have two (imaginary) people, Alice and 
Bob, attempting to share a secret over an open communication channel, and the 
evil Oscar listening in, attempting to figure out what the message says. We will 
begin by describing a private key scheme for exchanging secrets based on the ideas 
in our number theory course: 


Suppose that prime $p$ is given and integers $d$ and $e$ such that $de \equiv 1 \pmod{p-1}$.
Alice knows $p$ and $e$ but not $d$, whereas Bob knows $p$ and $d$ but not $e$. The numbers
$d$ and $e$ are kept secret by whoever knows them. Thus if Alice’s secret message is
$M$,[4] she encrypts $M$ by computing $x \equiv M^e \pmod p$. She sends the ciphertext $x$
over the open channel. Then Bob decrypts by raising $x$ to the $d$th power mod $p$,
since
$x^d \equiv (M^e)^d = M^{de} \equiv M \pmod p$

as $de \equiv 1 \pmod{p-1}$. As far as we know, Oscar will discover little by intercepting
the encrypted messages $x$, even if he intercepts many different $x$, and even if he
can occasionally make an astute guess at $M$. However, if Oscar is able to steal the
values of $p$ and $e$ from Alice, he will be able to determine $d$, since $d$ is the inverse of
$e$ mod $p-1$, and this can be determined by the Euclidean algorithm, as discussed in
exercise 3.5.5 (see the second proof of Corollary 8.5.2). He is then able to decipher
Alice’s future secret messages, in the same way as Bob does.


2 As portrayed, rather inaccurately, in the film The Imitation Game.

3 The ability to crack the Enigma code allowed the Allied leaders to save lives. However if they
used it so often that every possible life was saved, the Germans would have realized that the Allies
had broken the code, and then the Germans were liable to have moved on to a different cryptographic
method, which perhaps the Allied codebreakers might have been unable to decipher. Hence the Allied
leadership was forced to use its knowledge sparingly so that it would be available in the militarily most
advantageous situations. As a consequence, they knowingly sent many sailors to their doom, knowing
where the U-boats were waiting in ambush, but being forced not to disclose that information.
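
A toy version of this private key scheme shows both how it works and why stealing $p$ and $e$ is fatal. The sketch below assumes Python 3.8 or later, where pow(e, -1, m) returns the inverse of $e$ mod $m$ (computed, in effect, by the Euclidean algorithm); the prime $p = 101$ and exponent $e = 7$ are our own small example, far too small for real use.

```python
def decryption_exponent(p, e):
    """The inverse d of e modulo p - 1; requires (e, p - 1) = 1."""
    return pow(e, -1, p - 1)

def encrypt(M, e, p):
    """Alice's step: x = M^e mod p."""
    return pow(M, e, p)

def decrypt(x, d, p):
    """Bob's step: x^d = M^(de) = M mod p."""
    return pow(x, d, p)
```

With $p = 101$ and $e = 7$ we get $d = 43$, since $7 \cdot 43 = 301 \equiv 1 \pmod{100}$. Note that decryption_exponent needs only $p$ and $e$, which is exactly the computation available to Oscar once he has stolen them.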


This is the problem with most classical cryptosystems; once one knows the
encryption method it is not difficult to determine the decoding method. In 1975
Diffie and Hellman proposed a sensational idea: Can one find a cryptographic
scheme in which the encryption method gives no help in determining a decryption
method? If one could, one would then have a public key cryptographic scheme,
which is exactly what is needed in our age of electronic information, in particular
allowing people to use passwords in public places (for instance when using an ATM)
without fear that any lurking Oscar will be able to figure out how to impersonate them.[5]


In 1977 Rivest, Shamir, and Adleman (RSA) realized this ambition, via a
minor variation of the above private key cryptosystem.[6] Now let $p \ne q$ be two
large primes[7] and $n = pq$. Select integers $d$ and $e$ such that $de \equiv 1 \pmod{\phi(pq)}$.
Alice knows $pq$ and $e$ but not $d$, while Bob knows $pq$ and $d$. Thus if Alice’s secret
message is $M$, the ciphertext is $x \equiv M^e \pmod{pq}$, and Bob decrypts this by taking
$x^d \equiv (M^e)^d = M^{de} \equiv M \pmod{pq}$ as $de \equiv 1 \pmod{\phi(pq)}$ using Euler’s Theorem.

Now, if Oscar steals the values of $pq$ and $e$ from Alice, will he be able to
determine $d$, the inverse of $e$ mod $\phi(pq) = (p-1)(q-1)$? When the modulus was
the prime $p$, Oscar had no difficulty in determining $\phi(p) = p - 1$. Now that the
modulus is $pq$, can Oscar easily determine $(p-1)(q-1)$? If so, then, since he
already knows $pq$, he would be able to determine $pq + 1 - (p-1)(q-1) = p + q$ and
hence $p$ and $q$, since they are the roots of $x^2 - (p+q)x + pq = 0$. In practice, Oscar
needs to only know $d$ to factor $n$ (see exercise 5.27 in [CP05][8]). In other words, if
Oscar can “break” the RSA algorithm, then he can factor $n = pq$, and vice versa.
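
The step from $\phi(pq)$ to the factors is just the quadratic formula. A minimal sketch follows; the function name factor_from_phi is ours, and the modulus $3233 = 53 \cdot 61$ with $e = 17$ is a standard toy example rather than one taken from this chapter.

```python
from math import isqrt

def factor_from_phi(n, phi):
    """Recover p <= q from n = p*q and phi = (p - 1)*(q - 1)."""
    s = n + 1 - phi            # s = p + q
    disc = s * s - 4 * n       # (p + q)^2 - 4pq = (q - p)^2
    r = isqrt(disc)
    if r * r != disc:
        raise ValueError("phi is not consistent with n = p*q")
    return (s - r) // 2, (s + r) // 2
```

For $n = 3233$ and $\phi = 3120$ this gives $p + q = 114$ and $(q - p)^2 = 64$, hence the factors 53 and 61. The corresponding decryption exponent for $e = 17$ is pow(17, -1, 3120), which is 2753.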


We have just shown that breaking RSA is more or less as difficult as factoring.
Therefore RSA is a secure cryptographic protocol (when correctly implemented)
if and only if $n$ is a difficult integer to factor. But nobody truly knows whether
factoring is a difficult problem, nor how to select integers that are provably hard to
factor. In our current state of knowledge, we do not know any very efficient ways
to factor arbitrary large numbers, but that does not necessarily mean that there is
no quick way to do so.[9] So why do we put our faith (and secrets and fortunes) in
the difficulty of factoring? The security of a cryptographic protocol must evidently
be based on the difficulty of resolving some mathematical problem,[10] but we do
not know how to prove that any particular mathematical problem is necessarily
difficult to solve.[11] However the problem of factoring efficiently has been studied
by many of the greatest minds in history, from Gauss onwards, who have looked
for an efficient factoring algorithm and failed. Is this a good basis to have faith in
RSA? Probably not, but we have no better. (More on this at the end of section
of appendix 10F.)


4 Of course a message is usually in words, but one converts the letters to numbers using some simple
substitutions, like “01” for “A”, “02” for “B”, ... , “26” for “Z”, etc., and concatenates these numbers.
Thus “cabbie” becomes “030102020905”. It is this number that is our message that we denote by $M$.

5 When Alice uses a password, a cryptographic protocol might append a timestamp to ensure that
the encrypted password (plus timestamp) is different with each use, and so Bob will get suspicious if
the same timestamp is used again later.

6 It is now known that (Sir) Clifford Cocks, working for the British secret cryptography agency,
GCHQ, had discovered this RSA algorithm in 1974, and it had been classified “Top Secret”. See
https://www.wired.com/1999/04/crypto/ for the story.

7 We will develop fast methods to find large primes in appendix 10C.

8 This uses Pollard’s $p - 1$ method, which will not be discussed in this book, and is an algorithm
that runs in probabilistic polynomial time.


Exercise 10.3.1. Let $n = 11 \times 53$ be an RSA modulus with encryption exponent $e = 7$. Determine
$d$, the decryption exponent, by hand, using the Euclidean algorithm and the Chinese Remainder
Theorem.


Exercise 10.3.2. Let n = 5891 be an RSA modulus with encryption exponent e = 29 and 
decryption exponent d = 197. Use this information to factor n. 


10.4. Certificates and the complexity classes P and NP 


Algorithms are typically designed to work on any of an arbitrarily large class of
examples, and one wishes them to work as fast as possible. If the example is input in
$\ell$ characters, and the function calculated is genuinely a function of all the characters
of the input, then one cannot hope to compute the answer any quicker than the
length, $\ell$, of the input. A polynomial time algorithm is one in which the answer
is computed in no more than $c\ell^A$ steps, for some constants $c, A > 0$, no matter
what the input. These are considered to be quick algorithms. There are many
simple problems that can be answered in polynomial time (the set of such problems
is denoted by P and was already discussed in section 7.14 of appendix 7A); see
section 10.15 of appendix 10F for more details. In modern number theory, because
of the intrinsic interest as well as because of the applications to cryptography, we
are particularly interested in the running times of factoring and primality testing
algorithms.


At the 1903 meeting of the American Mathematical Society, F. N. Cole came
to the blackboard and, without saying a word, wrote down


$2^{67} - 1 = 147573952589676412927 = 193707721 \times 761838257287,$


long-multiplying the numbers out on the right side of the equation to prove that he
was indeed correct. Afterwards he said that figuring this out had taken him “three
years of Sundays”. The moral of this tale is that although it took Cole a great deal
of work and perseverance to find these factors, it did not take him long to justify
his result to a room full of mathematicians (and, indeed, to give a proof that he
was correct). Thus we see that one can provide a short proof, even if finding that
proof takes a long time.


9 There are some families of numbers that we know are easy to factor (for example, see exercise
10.7.2 for a fast factoring method if $p$ and $q$ are close together) so we need to avoid those when selecting
a modulus for RSA.

10 Here we are talking about cryptographic protocols on computers as we know them today. There
is a highly active quest to create quantum computers, on which cryptographic protocols are based on a
very different set of ideas.

11 We can prove that almost all mathematical problems are “difficult to solve” (see section
of appendix 10F), but we do not know how to identify one specific problem that is provably difficult to
solve. This is a notoriously challenging and important open problem.


In general one can exhibit factors of a given integer $n$ to give a short proof that
$n$ is composite. Such proofs, which can be checked in polynomial time, are called
certificates. (The set of problems for which the answer can be checked in polynomial
time is denoted by NP.) Note that it is not necessary to exhibit factors to give a
short proof that a number is composite. Indeed, we already saw in the converse to
Fermat’s Little Theorem, Corollary 7.2.1, that one can exhibit an integer $a$ coprime
to $n$ for which $n$ does not divide $a^{n-1} - 1$ to provide a certificate that $n$ is composite.
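
Both kinds of certificate of compositeness take one line each to check. In this sketch the function names are ours; checking a factorization is a single multiplication, and checking a Fermat witness is a single fast exponentiation, here done by Python's built-in three-argument pow.

```python
from math import gcd

def checks_factorization(n, factors):
    """A factor certificate: non-trivial factors whose product is n."""
    product = 1
    for f in factors:
        if not 1 < f < n:
            return False
        product *= f
    return product == n

def is_fermat_witness(n, a):
    """A Fermat certificate: a coprime to n with a^(n-1) not congruent to 1 mod n."""
    return gcd(a, n) == 1 and pow(a, n - 1, n) != 1
```

Cole's blackboard computation is the call checks_factorization(2**67 - 1, [193707721, 761838257287]). For $n = 341$ the base 2 is not a witness, since $2^{340} \equiv 1 \pmod{341}$, but the base 3 is.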


What about primality testing? If someone gives you an integer and asserts
that it is prime, can you quickly check that this is so? Can they give you better
evidence than their say-so that it is a prime number? Can they provide some sort
of certificate that gives you all the information you need to quickly verify that the
number is indeed a prime? We had hoped (see section 7.6) that we could use the
converse of Fermat’s Little Theorem to establish a quick primality test, but we
saw that Carmichael numbers seem to stop that idea from reaching fruition. Here
we are asking for less, for a short certificate for a proof of primality. It is not
obvious how to construct such a certificate, certainly not so obvious as with the
factoring problem. It turns out that some old remarks of Lucas from the 1870s can
be modified for this purpose. We begin with a sure-fire primality test, obtained as
a consequence of Proposition 7.5.1.


Corollary 10.4.1. Suppose that $n > 1$ is a positive integer for which there exists
an integer $g$ with $(g,n) = 1$ such that $g^{n-1} \equiv 1 \pmod n$ and $g^{(n-1)/q} \not\equiv 1 \pmod n$
for every prime $q$ dividing $n - 1$. Then $n$ is a prime.


Proof. Proposition 7.5.1 implies that $g$ has order $n - 1$ (mod $n$), so that the $n - 1$
reduced residues $1, g, \ldots, g^{n-2}$ are all distinct mod $n$. Therefore every integer $a$ in
the range $1 \le a \le n - 1$ is coprime to $n$, implying that $n$ is prime.


We are not suggesting that Corollary 10.4.1 provides a fast primality test. One
can probably find $g$ rapidly, if it exists, using Gauss’s algorithm which is discussed
in section 7.15 of appendix 7B. However the algorithm requires one to completely
factor $n - 1$, and we have no particularly fast factoring algorithms. On the other
hand, if $n - 1$ has already been factored, then one can proceed rapidly. Indeed
we can provide a “certificate” to allow a checker to quickly verify that $n$ is prime,
which would consist of

$g$ and $\{q \text{ prime} : q \text{ divides } n - 1\}$.
The checker would need to verify that $g^{n-1} \equiv 1 \pmod n$ whereas $g^{(n-1)/q} \not\equiv 1$
(mod $n$) for all primes $q$ dividing $n - 1$, something that can be quickly accomplished
using fast exponentiation (as explained in section 7.13 of appendix 7A).

There is a problem though: One needs (the additional) certification that each
such $q$ is prime. The solution is to iterate the above algorithm; and one can show
that no more than $\log n$ odd primes need to be certified prime in the process of
proving that $n$ is prime. Thus we have a “short” certificate that $n$ is prime.

At first one might hope that this also provides a quick way to test whether a
given integer $n$ is prime. However there are several obstacles. The most important
is that we need to factor $n - 1$ in creating the certificate. When one is handed
the certificate, $n - 1$ is already factored, so that is not an obstacle to the use of
the certificate; however it is a fundamental impediment to the rapid creation of the
certificate (and therefore to using this as a primality test).
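
The checker's side of this procedure can be sketched in a few lines. The data format is our own assumption: a certificate is a dictionary sending each odd prime $n$ that occurs to a pair consisting of $g$ and the list of primes dividing $n - 1$, with the primality of 2 taken for granted. The checker also confirms that the listed primes account for all of $n - 1$.

```python
def verify(n, cert):
    """Check a recursive primality certificate for n (format described above)."""
    if n == 2:
        return True                      # base case: 2 is assumed prime
    if n not in cert:
        return False
    g, primes = cert[n]
    m = n - 1
    for q in primes:                     # the listed q must exhaust n - 1
        if m % q != 0:
            return False
        while m % q == 0:
            m //= q
    if m != 1:
        return False
    if pow(g, n - 1, n) != 1:            # g^(n-1) = 1 (mod n)
        return False
    if any(pow(g, (n - 1) // q, n) == 1 for q in primes):
        return False                     # g must have order exactly n - 1
    return all(verify(q, cert) for q in primes)  # each q needs its own certificate
```

For example, $71 - 1 = 2 \cdot 5 \cdot 7$ and $g = 7$ has order 70 mod 71, so the dictionary {71: (7, [2, 5, 7]), 5: (2, [2]), 7: (3, [2, 3]), 3: (2, [2])} is accepted as a certificate for 71.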


Exercise 10.4.1. Assuming only that 2 is prime, provide a certificate that proves that 107 is
prime.


Exercise 10.4.2. Let $F_m = 2^{2^m} + 1$ with $m \ge 2$ be a Fermat number.
(a) Prove that if there exists an integer $q$ for which $q^{(F_m - 1)/2} \equiv -1 \pmod{F_m}$, then $F_m$ is prime.
(b) Deduce an “if and only if” condition for the primality of $F_m$ using exercise 8.5.4.


10.5. Polynomial time primality testing 


Although the converse to Fermat’s Little Theorem does not provide a polynomial
time primality test, one can further develop this idea. For example, we know that
$a^{(p-1)/2} \equiv -1$ or $1 \pmod p$ by Euler’s criterion, and hence if $a^{(n-1)/2} \not\equiv \pm 1 \pmod n$,
then $n$ is composite. This identifies even more composite $n$ than Corollary 7.2.1
alone, but not necessarily all $n$. We develop this idea further in section 10.8 of
appendix 10A to find a criterion of this type that is satisfied by all primes but not
by any composites. However we are unable to prove that this is indeed a polynomial
time primality test without making certain assumptions that are, as yet, unproved.
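
A small experiment shows the gain over the Fermat test. The sketch below (function name ours) assumes $n$ is odd and greater than 2 and that $a$ is coprime to $n$. The Carmichael number $561 = 3 \cdot 11 \cdot 17$ satisfies $5^{560} \equiv 1 \pmod{561}$, so base 5 reveals nothing via Fermat, yet $5^{280}$ is $1$ mod 3 and $-1$ mod 17, hence neither $1$ nor $-1$ mod 561.

```python
from math import gcd

def euler_exposes_composite(n, a):
    """True if a^((n-1)/2) is neither 1 nor -1 mod n, which proves n composite."""
    if n < 3 or n % 2 == 0 or gcd(a, n) != 1:
        raise ValueError("need odd n > 2 and a coprime to n")
    return pow(a, (n - 1) // 2, n) not in (1, n - 1)
```

A return value of False says only that this particular base fails to expose $n$; it does not show that $n$ is prime.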


There have indeed been many ideas for establishing a primality test which 
is provably polynomial time, but this was not achieved until 2002. This was of 
particular interest since the proof was given by a professor, Manindra Agrawal, and 
two undergraduate students, Kayal and Saxena, working together with Agrawal 
on a summer research project. Their algorithm is based on the following elegant 
characterization of prime numbers. 


Theorem 10.1 (Agrawal, Kayal, and Saxena (AKS)). For given integer $n \ge 2$, let
$r$ be a positive integer $< n$, for which $n$ has order $> 9(\log n)^2$ modulo $r$. Then $n$ is
prime if and only if


• $n$ is not a perfect power,
• $n$ does not have any prime factor $\le r$,
• $(x + a)^n \equiv x^n + a \bmod (n, x^r - 1)$ for each integer $a$, $1 \le a \le 3\sqrt{r} \log n$.


The last equation uses “modular arithmetic” in a way that is new to us, but
analogous to what we have seen: $(x + a)^n \equiv x^n + a \bmod (n, x^r - 1)$ means that
there exist $f(x), g(x) \in \mathbb{Z}[x]$ such that $(x + a)^n - (x^n + a) = n f(x) + (x^r - 1) g(x)$.
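
Working mod $(n, x^r - 1)$ means reducing coefficients mod $n$ and exponents mod $r$, so every polynomial is stored as a list of $r$ coefficients. The following sketch checks the third condition for a single $a$; it is meant only to make the notation concrete (the function names are ours, and the quadratic-time multiplication is far from the fast arithmetic a serious implementation would use).

```python
def polymul(f, g, n, r):
    """Multiply two length-r coefficient lists modulo (n, x^r - 1)."""
    h = [0] * r
    for i, fi in enumerate(f):
        if fi:
            for j, gj in enumerate(g):
                k = (i + j) % r          # x^(i+j) = x^((i+j) mod r)
                h[k] = (h[k] + fi * gj) % n
    return h

def polypow(f, e, n, r):
    """Compute f^e modulo (n, x^r - 1) by repeated squaring."""
    result = [1 % n] + [0] * (r - 1)
    while e:
        if e & 1:
            result = polymul(result, f, n, r)
        f = polymul(f, f, n, r)
        e >>= 1
    return result

def aks_congruence(n, r, a):
    """Does (x + a)^n = x^n + a hold modulo (n, x^r - 1)?"""
    base = [0] * r
    base[0] = a % n
    base[1 % r] = (base[1 % r] + 1) % n  # the polynomial x + a
    rhs = [0] * r
    rhs[n % r] = 1 % n                   # x^n reduces to x^(n mod r)
    rhs[0] = (rhs[0] + a) % n
    return polypow(base, n, n, r) == rhs
```

For example, with $n = 4$, $r = 3$, $a = 1$ the left side reduces to $1 + x + 2x^2$ while the right side is $1 + x$, so the congruence fails, as it must for some admissible $a$ when $n$ is composite and $r$ satisfies the hypotheses.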

At first sight this might seem to be a rather complicated characterization of the
prime numbers. However this fits naturally into the historical progression of ideas
in this subject (indeed, see appendix 10G for a discussion and a proof), is not so
complicated (compared to some other ideas in use), and has the great advantage
that it is straightforward to develop into a fast algorithm for proving the primality
of large primes. However, although the AKS algorithm satisfies the desire to have a
rigorously proved polynomial time primality testing algorithm, it is not in practice
the fastest algorithm for establishing primality of the largest integers currently
being considered.[12]


Exercise 10.5.1. Let $p^k$ be the highest power of prime p that divides n, with $k \ge 1$.
(a) Prove that $p^k$ does not divide $\binom{n}{p}$.
(b) Deduce that n does not divide $\binom{n}{p}$.
(c) Show that if n is composite, then n does not divide all the coefficients of the polynomial
$(1+x)^n - x^n - 1$.


Exercise 10.5.2. Use the previous exercise to show:
(a) n is prime if and only if $(x+1)^n \equiv x^n + 1 \pmod n$.
(b) If (n,a) = 1, then n is prime if and only if $(x+a)^n \equiv x^n + a \pmod n$.
(c) Prove that if n is prime, then $(x+a)^n \equiv x^n + a \pmod{(n, x^r - 1)}$ for any integer a with
(a,n) = 1 and any $r \ge 1$.


10.6. Factoring methods 


The problem of distinguishing prime numbers from composite numbers and 
of resolving the latter into their prime factors is known to be one of the most 
important and useful in arithmetic. It has engaged the industry and wisdom of 
ancient and modern geometers to such an extent that it would be superfluous to 
discuss the problem at length. Nevertheless we must confess that all methods 
that have been proposed thus far are either restricted to very special cases 
or are so laborious and difficult that even for numbers that do not exceed 
the limits of tables constructed by estimable workers, they try the patience 
of even the practiced calculator. And these methods do not apply at all to 
larger numbers .... It frequently happens that the trained calculator will
be sufficiently rewarded by reducing large numbers to their factors so that it 
will compensate for the time spent. Further, the dignity of the science itself 
seems to require that every possible means be explored for the solution of a 
problem so elegant and so celebrated .... It is in the nature of the problem 
that any method will become more complicated as the numbers get larger. 
Nevertheless, in the following methods the difficulties increase rather slowly
.... The techniques that were previously known would require intolerable
labor even for the most indefatigable calculator.
— from article 329 of Disquisitiones Arithmeticae (1801) by C. F. GAUSS


The first factoring method, other than trial division, was given by Fermat: His
goal was to write a given odd integer n as $x^2 - y^2$, so that $n = (x - y)(x + y)$. He
started with m, the smallest integer $\ge \sqrt{n}$, and then looked to see if $m^2 - n$ is a
square. If so, say $m^2 - n = r^2$, then $n = (m - r)(m + r)$.

It is not easy to determine (at least by hand) whether a large integer is a square,
though most are not. Fermat simplified his algorithm by quickly eliminating non-squares,
by testing whether $m^2 - n$ is a square modulo various small primes. If
$m^2 - n$ is not a square, then he tested whether $(m+1)^2 - n$ is a square; if that
failed, whether $(m+2)^2 - n$ is a square, or $(m+3)^2 - n$, ..., etc. Since Fermat
computed by hand he also noted the trick that

$(m+1)^2 - n = m^2 - n + (2m+1)$,
$(m+2)^2 - n = (m+1)^2 - n + (2m+3)$, etc.,

so that, at each step he only needed to add a relatively small number to the integer
he had just tested, and the next add-on is just two larger than the previous one.

[12] Because other algorithms that we believe, but cannot prove, are polynomial time, run faster.


For example, Fermat factored n = 2027651281, for which m = 45030. Then
$45030^2 - n = 49619$ which is not a square mod 100;
$45031^2 - n = 49619 + 90061 = 139680$ which is divisible by $2^5$, not $2^6$;
$45032^2 - n = 139680 + 90063 = 229743$ which is divisible by $3^3$, not $3^4$;
$45033^2 - n = 229743 + 90065 = 319808$ which is not a square mod 3; etc.

up until $45041^2 - n = 1020^2$, so that
$n = 2027651281 = 45041^2 - 1020^2 = (45041 - 1020) \times (45041 + 1020) = 44021 \times 46061$.
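Fermat's procedure, including his trick of updating $m^2 - n$ by adding $2m+1$ at each step, can be sketched as follows. This is a minimal illustration, assuming n is odd and greater than 1; the name `fermat_factor` is hypothetical, and the quick elimination of non-squares modulo small primes is replaced here by a direct integer square root.

```python
from math import isqrt


def fermat_factor(n):
    """Return (m - r, m + r) with n = m^2 - r^2, for odd n > 1."""
    m = isqrt(n)
    if m * m < n:
        m += 1                  # m is now the smallest integer >= sqrt(n)
    t = m * m - n               # current value of m^2 - n
    while True:
        r = isqrt(t)
        if r * r == t:          # m^2 - n is a perfect square r^2
            return m - r, m + r
        t += 2 * m + 1          # (m+1)^2 - n = (m^2 - n) + (2m + 1)
        m += 1
```

Applied to n = 2027651281 the loop starts at m = 45030 and stops at m = 45041, returning (44021, 46061) as in the example above.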


Exercise 10.6.1. Factor 1649 using Fermat’s method. 


Gauss and other authors further developed Fermat’s ideas, most importantly
realizing that if $x^2 \equiv y^2 \pmod n$ with $x \not\equiv \pm y \pmod n$ and (x,n) = 1, then

$\gcd(n, x - y) \cdot \gcd(n, x + y)$

gives a non-trivial factorization of n.


The issue now becomes to rapidly determine two residues x and y (mod n) with
$x \not\equiv y$ or $-y \pmod n$, such that $x^2 \equiv y^2 \pmod n$. Several factoring algorithms
work by generating a sequence of integers $a_1, a_2, \ldots$, with each

$a_i \equiv b_i^2 \pmod n$ but $a_i \ne b_i^2$

for some known integer $b_i$, until some subsequence of the $a_i$’s has product equal to
a square, say

$y^2 = a_{i_1} \cdots a_{i_r}$.

Then one sets $x^2 = (b_{i_1} \cdots b_{i_r})^2$ to obtain $x^2 \equiv y^2 \pmod n$, and there is a good
chance that $\gcd(n, x - y)$ is a non-trivial factor of n.


We want to generate the $a_i$’s so that it is not so difficult to find a subsequence
whose product is a square; to do so, we need to be able to factor the $a_i$. This
is most easily done by only keeping those $a_i$ that have all of their prime factors
$\le B$, for some appropriately chosen bound B. Suppose that the primes up to B
are $p_1, p_2, \ldots, p_k$. If $a_i = p_1^{a_{i,1}} p_2^{a_{i,2}} \cdots p_k^{a_{i,k}}$, then let $v_i = (a_{i,1}, a_{i,2}, \ldots, a_{i,k})$, which
is a vector with entries in $\mathbb{Z}$.


Exercise 10.6.2. Show that $\prod_{i \in I} a_i$ is a square if and only if $\sum_{i \in I} v_i \equiv (0, 0, \ldots, 0) \pmod 2$.


Hence to find a non-trivial subset of the $a_i$ whose product is a square, we simply
need to find a non-trivial linear dependency mod 2 amongst the vectors $v_i$. This is
easily achieved through the methods of linear algebra and guaranteed to exist once
we have generated more than k such integers $a_i$.
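The linear algebra step can be sketched with Gaussian elimination over the integers mod 2, storing each parity vector as a bitmask. This is a minimal illustration under stated assumptions: the function names are hypothetical, the $b_i$ are simply taken from a small range supplied by the caller, and the example modulus 2041 is our own choice rather than one from the text.

```python
from math import gcd


def exponent_vector(a, primes):
    """Exponents of a over `primes`, or None if a has another prime factor."""
    if a == 0:
        return None
    exps = []
    for p in primes:
        e = 0
        while a % p == 0:
            a //= p
            e += 1
        exps.append(e)
    return exps if a == 1 else None


def square_subset(vectors):
    """Indices of a non-empty subset whose vector sum is 0 mod 2, or None."""
    basis = {}                       # top bit -> (parity mask, indices used)
    for i, v in enumerate(vectors):
        mask = sum((e & 1) << j for j, e in enumerate(v))
        used = 1 << i
        while mask:
            pivot = mask.bit_length() - 1
            if pivot not in basis:
                basis[pivot] = (mask, used)
                break
            bmask, bused = basis[pivot]
            mask ^= bmask            # cancel the top bit
            used ^= bused
        else:                        # mask reached 0: a dependency was found
            return [k for k in range(len(vectors)) if used >> k & 1]
    return None


def factor_from_relations(n, primes, bs):
    """Try to split n using those b in `bs` for which b^2 mod n is smooth."""
    rels = [(b, exponent_vector(b * b % n, primes)) for b in bs]
    rels = [(b, v) for b, v in rels if v is not None]
    subset = square_subset([v for _, v in rels])
    if subset is None:
        return None
    x, total = 1, [0] * len(primes)
    for i in subset:
        b, v = rels[i]
        x = x * b % n
        total = [t + e for t, e in zip(total, v)]
    y = 1
    for p, t in zip(primes, total):
        y = y * pow(p, t // 2, n) % n    # y^2 is the product of the chosen a_i
    return gcd(x - y, n)                 # may be trivial (1 or n) in general
```

With n = 2041, the primes up to 7, and b running from 46 to 51, the smooth values are $46^2 \equiv 75$, $47^2 \equiv 168$, $49^2 \equiv 360$, and $51^2 \equiv 560$; their product is a square, and the resulting gcd is 13 (indeed 2041 = 13 × 157).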

The quadratic sieve factoring algorithm selects the $b_i$ so that it is easy to find
the small prime factors of the $a_i$, using Corollary 2.3.1. There are other algorithms
that attempt to select the $b_i$ so that the $a_i$ are small and therefore more likely to
have small prime factors. We discuss some of these in appendix 10B. The best
algorithm, the number field sieve, is an analogue of the quadratic sieve algorithm
over number fields.


There are many other cryptographic protocols based on ideas from number 
theory. Some of these will be discussed in the appendices to this chapter. 


References: See and [Knu98], as well as: 


[1] Carl Pomerance, A tale of two sieves, Notices Amer. Math. Soc. 43 (1996), 1473-1485. 
[2] John D. Dixon, Factorization and primality tests, Amer. Math. Monthly 91 (1984), 333-352.


Additional exercises 


Exercise 10.7.1. Suppose that n is an odd composite integer. Prove that for at least half the
pairs x, y with $0 < x, y < n$ and $x^2 \equiv y^2 \pmod n$, we have $1 < \gcd(x - y, n) < n$.


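Exercise 10.7.2. Factor n = 62749. Let $m = [\sqrt{n}] + 1 = 251$. Compute $(m + i)^2 \pmod n$
for i = 0, 1, 2, ... and retain those residues whose prime factors are all $\le 11$. Therefore we have
$251^2 \equiv 2^2 \cdot 3^2 \cdot 7$; $253^2 \equiv 2^2 \cdot 3^2 \cdot 5 \cdot 7$; $257^2 \equiv 2^2 \cdot 3 \cdot 5^2 \cdot 11$; $260^2 \equiv 3^2 \cdot 7^2 \cdot 11$; $268^2 \equiv 3 \cdot 5^2 \cdot 11^2$;
$271^2 \equiv 2^2 \cdot 3^5 \cdot 11 \pmod n$. Use this information to factor n.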


Exercise 10.7.3. Alice is sending Bob messages using RSA with public key modulus n =
2027651281 and encryption exponent e = 66308903. Oscar recalls that n is the number Fermat
factored in section 10.6. Find the decryption exponent for Oscar.


We wish to determine how many different odd primes are involved in the Lucas
certificate of section 10.4.


Exercise 10.7.4. Let n be prime and suppose $q_1, \ldots, q_k$ are the odd prime factors of n − 1.
(a) Prove that the product of these primes, $N_1 := q_1 \cdots q_k$, is $\le n/2$.
(b)† To certify that $q_1, \ldots, q_k$ are prime we need the set of odd prime factors of $q_1 - 1, \ldots, q_k - 1$.
Let’s call those primes $p_1, \ldots, p_\ell$. Prove that the product of these primes, $N_2 := p_1 \cdots p_\ell$,
is $\le N_1/2^k$.
(c) Generalize this argument to show that if there are r primes to be certified at the jth stage,
then $N_{j+1} \le N_j/2^r$.
(d)† Prove that if there are m primes that were certified to be prime during all the steps of this
argument, then $2^m \le n$. Explain why this implies that primality testing is in NP.

Exercise 10.7.5.† Suppose n is an odd composite, and $a^{(n-1)/2} \equiv 1$ or $-1 \pmod n$ for every a
with (a,n) = 1. Deduce that $a^{(n-1)/2} \equiv 1 \pmod n$ for every a with (a,n) = 1 and that n is a
Carmichael number.


Appendix 10A. Pseudoprime
tests using square roots of 1


In section 7.6 we noted that the converse to Fermat’s Little Theorem may be used
to give a quick proof that a given integer n is composite: One simply finds an integer
a, not divisible by n, for which $a^{n-1} \not\equiv 1 \pmod n$ (if this fails, that is, if $a^{n-1} \equiv 1$
(mod n) and n is composite, then n is called a base-a pseudoprime). Such a search
often works quickly, especially for randomly chosen values of n, but can fail if the 
tested n have some special structure. For example, it always fails for Carmichael 
numbers, which have the property that n is a base-a pseudoprime for every a with 
(a,n) = 1. What can we do in these cases? Can we construct a test, based on 
similar ideas, that is guaranteed to recognize even these composite numbers? 


10.8. The difficulty of finding all square roots of 1 


Lemma 10.1.1 implies that there are at least four distinct square roots of 1 (mod n),
for any odd n which is divisible by at least two distinct primes. This suggests that
we might try to prove that a given base-a pseudoprime n is composite by finding a
square root of 1 (mod n) which is neither 1 nor −1. (If we can find such a square
root of 1 (mod n), then we can partially factor n, as discussed in section 10.1.) The
issue then becomes: How do we efficiently search for a square root of 1?


This is not difficult: Since n is a base-a pseudoprime, we have

$\left(a^{(n-1)/2}\right)^2 = a^{n-1} \equiv 1 \pmod n$,

and so $a^{(n-1)/2} \pmod n$ is a square root of 1 (mod n). By Euler’s criterion we know
that if p is prime, then $a^{(p-1)/2} \equiv (a/p) \pmod p$, so that $a^{(p-1)/2} \equiv 1$ or $-1 \pmod p$. If n
is a base-a pseudoprime (and therefore composite), it is feasible that $a^{(n-1)/2} \not\equiv (a/n)$
(mod n), which would imply that n is composite. If $a^{(n-1)/2} \pmod n$ is neither 1 nor
−1, this allows us to factor n into two parts, since

$n = \gcd(a^{(n-1)/2} - 1, n) \cdot \gcd(a^{(n-1)/2} + 1, n)$.

If n is composite and $a^{(n-1)/2} \equiv (a/n) \pmod n$, then we call n a base-a Euler pseudoprime.

For example, 1105 is a Carmichael number, and so $2^{1104} \equiv 1 \pmod{1105}$. We
take the square root, and determine that $2^{552} \equiv 1 \pmod{1105}$. So this method fails
to prove that 1105 is composite, since 1105 is a base-2 Euler pseudoprime. But,
wait a minute, 552 is even, so we can take the square root again, and a calculation
reveals that $2^{276} \equiv 781 \pmod{1105}$. That is, 781 is a square root of 1 mod 1105,
which proves that 1105 is composite. Moreover, since gcd(781 − 1, 1105) = 65 and
gcd(781 + 1, 1105) = 17, we can even factor 1105 as 65 × 17.[13]

This property is even more striking mod 1729. In this case $1728 = 2^6 \cdot 27$ so we
can take square roots many times. Indeed, taking successive square roots of $2^{1728}$
we determine that

$1 \equiv 2^{1728} \equiv 2^{864} \equiv 2^{432} \equiv 2^{216} \equiv 2^{108} \pmod{1729}$, but then $2^{54} \equiv 1065 \pmod{1729}$.

This proves that 1729 is composite, and even that
1729 = gcd(1064, 1729) × gcd(1066, 1729) = 133 × 13.


This protocol of taking successive square roots can fail to identify that our
given pseudoprime is indeed composite; for example, we cannot use 103 to prove
that either 561 or 1729 is composite, since

$103^{35} \equiv 1 \pmod{561}$, and so $103^{70} \equiv \cdots \equiv 103^{560} \equiv 1 \pmod{561}$,
$103^{27} \equiv -1 \pmod{1729}$, and so $103^{54} \equiv \cdots \equiv 103^{1728} \equiv 1 \pmod{1729}$,

but such failures are rare (see exercise 10.8.7).


Suppose that n is a composite integer with $n - 1 = 2^k m$ for some integer $k \ge 1$
with m odd. We call n a base-a strong pseudoprime if the sequence of residues

(10.8.1) $a^{(n-1)/2} \pmod n,\ a^{(n-1)/4} \pmod n,\ \ldots,\ a^{(n-1)/2^k} \pmod n$

is equal to either

$1, 1, \ldots, 1$ or $1, 1, \ldots, 1, -1, *, \ldots, *$

where the *’s stand for any residue mod n. These are the only two possibilities if
n is prime, and so if the sequence of residues in (10.8.1) looks like one of these two
possibilities, then this information does not allow us to deduce that n is composite.


On the other hand, if n is not a base-a strong pseudoprime, then we say that
a is a witness (to n being composite). To be more precise:


Definition. Suppose that n is a composite odd integer and $n - 1 = 2^k m$ for some
integer $k \ge 1$ with m odd. Assume that n is a base-a pseudoprime; that is, $a^{n-1} \equiv 1$
(mod n). If $a^m \equiv 1 \pmod n$ or $a^{2^j m} \equiv -1 \pmod n$ for some integer $j \ge 0$, then
n is a base-a strong pseudoprime. Otherwise a is a witness (to the compositeness
of n) and if $\ell$ is the largest integer for which $a^{2^\ell m} \not\equiv -1$ or $1 \pmod n$, then
$\gcd(a^{2^\ell m} - 1, n)$ is a non-trivial factor of n.


[13] We have not factored 1105 into prime factors (since 65 factors further as 65 = 5 × 13), but rather
into two non-trivial factors.

One can compute high powers modulo n very rapidly using “fast exponentiation”
(a technique we discussed in section 7.13 of appendix 7A), so this strong
pseudoprime test can be done quickly and easily.
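The strong pseudoprime test and the factor extraction described in the Definition can be sketched as follows. This is a minimal illustration, assuming n is odd and greater than 2; the function names are hypothetical, and Python's built-in three-argument `pow` plays the role of fast exponentiation.

```python
from math import gcd


def split_n_minus_1(n):
    """Write n - 1 = 2^k * m with m odd and return (k, m)."""
    k, m = 0, n - 1
    while m % 2 == 0:
        m //= 2
        k += 1
    return k, m


def passes_strong_test(a, n):
    """True if odd n > 2 is prime or a base-a strong pseudoprime."""
    k, m = split_n_minus_1(n)
    x = pow(a, m, n)
    if x == 1 or x == n - 1:
        return True
    for _ in range(k - 1):
        x = x * x % n                  # a^(2m), a^(4m), ..., a^((n-1)/2)
        if x == n - 1:
            return True
    return False                       # a is a witness


def factor_from_witness(a, n):
    """If squaring first reaches 1 from a value other than +-1, split n."""
    k, m = split_n_minus_1(n)
    x = pow(a, m, n)
    for _ in range(k):
        prev, x = x, x * x % n
        if x == 1:
            if prev not in (1, n - 1):
                return gcd(prev - 1, n)   # prev is a square root of 1, not +-1
            return None
    return None                        # a^(n-1) is not 1 mod n
```

On the examples above, base 2 is a witness for both 1105 and 1729 and yields the factors 65 and 133, whereas base 103 is not a witness for 561 or for 1729.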


In exercise 10.8.7 we will show that at least three-quarters of the integers a, $1 \le a \le n$,
with (a,n) = 1 are witnesses for n, for each odd composite n > 9. So can
we find a witness quickly if n is composite?


• The most obvious idea is to try a = 2, 3, 4, ... consecutively until we find a
witness. It is believed that there is a witness $\le 2(\log n)^2$, but we cannot prove this
(though we can deduce this from a famous conjecture, the Generalized Riemann
Hypothesis[14]).


• Pick integers $a_1, a_2, \ldots, a_\ell, \ldots$ from {1, 2, 3, ..., n − 1} at random until we
find a witness. By what we wrote above, if n is composite, then the probability that
none of $a_1, a_2, \ldots, a_\ell$ are witnesses for n is $\le 1/4^\ell$. Thus with a hundred or so such
tests we get a probability that is so small that it is inconceivable that it could occur
in practice; so we believe that any integer n for which none of a hundred randomly
chosen a’s is a witness is prime. We call such n “industrial strength primes” since
they have not been proven to be prime, but there is an enormous weight of evidence
that they are not composite.
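Both search strategies can be sketched in a few lines. This is a minimal illustration with hypothetical function names; the default of 100 rounds follows the "hundred or so" tests mentioned above, and Python's `random` module stands in for whatever source of randomness one trusts.

```python
import random


def passes_strong_test(a, n):
    """True if odd n > 2 is prime or a base-a strong pseudoprime."""
    k, m = 0, n - 1
    while m % 2 == 0:                  # write n - 1 = 2^k * m with m odd
        m //= 2
        k += 1
    x = pow(a, m, n)
    if x == 1 or x == n - 1:
        return True
    for _ in range(k - 1):
        x = x * x % n
        if x == n - 1:
            return True
    return False


def smallest_witness(n):
    """Try a = 2, 3, 4, ... in turn; return the first witness, or None."""
    for a in range(2, n):
        if not passes_strong_test(a, n):
            return a
    return None


def industrial_strength_prime(n, rounds=100, rng=random):
    """False as soon as a random a is a witness; True if none is found."""
    if n < 2:
        return False
    if n < 4:
        return True                    # 2 and 3
    if n % 2 == 0:
        return False
    for _ in range(rounds):
        a = rng.randrange(1, n)        # uniform on {1, ..., n - 1}
        if not passes_strong_test(a, n):
            return False
    return True
```

A return value of True is only evidence, not proof, of primality. For example, 2047 = 23 × 89 passes the test to base 2, so the consecutive search has to go on to a = 3 before it finds a witness.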


This test is a random polynomial time test for compositeness (like our test for 
finding a quadratic non-residue given at the end of appendix 8B). If n is composite, 
then the randomized witness test is almost certain to provide a short proof of n’s 
compositeness in 100 runs of the test. On the other hand, if 100 runs of the test 
do not produce a witness, then we can be almost certain that n is prime, but we 
cannot be absolutely certain since no proof is provided, and therefore we have an 
industrial strength prime. 


In practice the witness test accomplishes Gauss’s dream of quickly distinguish- 
ing between primes and composites, for either we will quickly get a witness to n 
being composite or, if not, we can be almost certain that our industrial strength 
prime is indeed prime. Although this solves the problem in practice, we cannot 
be absolutely certain that we have distinguished correctly when we claim that n is 
prime since we have no proof, and mathematicians like proof. Indeed if you claim 
that industrial strength primes are prime, without proof, then a cynic might not 
believe that your randomly chosen a are so random or that you are unlucky or .... 
No, what we need is a proof that a number is prime when we think that it is. 


Exercise 10.8.1. Find all bases b for which 15 is a base-b Euler pseudoprime. 


Exercise 10.8.2.† We wish to show that every odd composite n is not a base-b Euler pseudoprime
for some integer b, coprime to n. Suppose not, i.e., that n is a base-b Euler pseudoprime for every
integer b with (b,n) = 1.
(a) Show that n is a Carmichael number.
(b) Show that if prime p divides n, then p − 1 cannot divide $\frac{n-1}{2}$.
(c) Deduce that $(b/n) \equiv (b/p) \pmod p$ for each prime p dividing n.
(d) Explain why (c) cannot hold for every integer b coprime to n.


[14] We discussed the Riemann Hypothesis, and its generalizations, in sections 5.16 and 5.17 of
appendix 5D. Suffice to say that this is one of the most famous and difficult open problems of mathematics,
so much so that the Clay Mathematics Institute has now offered one million dollars for its resolution
(see http://www.claymath.org/millennium-problems/).

Exercise 10.8.3. Prove that $F_n = 2^{2^n} + 1$ is either a prime or a base-2 strong pseudoprime.


Exercise 10.8.4. Prove that if n is a base-2 pseudoprime, then $2^n - 1$ is a base-2 strong
pseudoprime and a base-2 Euler pseudoprime. Deduce that there are infinitely many base-2 strong
pseudoprimes.


Exercise 10.8.5. Pépin showed that one can test Fermat numbers $F_m$ for primality by using
just one strong pseudoprime test; i.e., $F_m$ is prime if and only if $3^{(F_m-1)/2} \equiv -1 \pmod{F_m}$.
(a) Use exercise 8.5.4 to show if $F_m$ is prime, then $3^{(F_m-1)/2} \equiv -1 \pmod{F_m}$.
(b) In the other direction show that if $3^{(F_m-1)/2} \equiv -1 \pmod{F_m}$, then $\mathrm{ord}_p(3) = 2^{2^m}$
whenever prime $p \mid F_m$.
(c) Deduce that $F_m - 1 \le p - 1$ in (b) and so $F_m$ is prime.
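Pépin's criterion is a single modular exponentiation, so it is easy to try out on small Fermat numbers. The sketch below is only an illustration (the name `pepin` is hypothetical) and assumes $m \ge 1$, since the criterion is not meant for $F_0 = 3$.

```python
def pepin(m):
    """Pepin's test for F_m = 2^(2^m) + 1 with m >= 1."""
    F = 2 ** (2 ** m) + 1
    # F_m is prime exactly when 3^((F_m - 1)/2) is -1 mod F_m
    return pow(3, (F - 1) // 2, F) == F - 1
```

The test reports $F_1, F_2, F_3, F_4$ (that is, 5, 17, 257, 65537) as prime and $F_5 = 641 \times 6700417$ as composite.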


Exercise 10.8.6.† (a) Prove that $A := (4^p + 1)/5$ is composite for all primes p > 3.
(b) Deduce that A is a base-2 strong pseudoprime.


Exercise 10.8.7. How many witnesses are there mod n? Suppose that $n - 1 = 2^k m$ with m
odd and $k \ge 1$, and that n has $\omega$ distinct prime factors. Let $g_p$ be the largest odd integer dividing
$(p-1, n-1)$, and let $2^{R+1}$ be the largest power of 2 dividing $\gcd(p - 1 : p \mid n)$.
(a) Prove that $R \le k - 1$.
(b) Show that (10.8.1) is $1, 1, \ldots, 1$ if and only if $a^{g_p} \equiv 1 \pmod{p^e}$ for every prime power $p^e \| n$.
(c) Show that there are $\prod_{p \mid n} g_p$ such integers a (mod n).
(d) Show that if (10.8.1) is $1, 1, \ldots, 1, -1, *, \ldots, *$, with r *’s at the end, then $0 \le r \le R$, and
that this holds if and only if $a^{2^r g_p} \equiv -1 \pmod{p^e}$ for every prime power $p^e \| n$.
(e) Show that there are $\le \prod_{p \mid n} 2^r g_p$ such integers a (mod n).

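(f) Show the number of strong pseudoprimes mod n is

$\le \prod_{p \mid n} 2^R g_p \cdot \left( \frac{1}{2^{R\omega}} + \sum_{r=0}^{R} \frac{1}{2^{(R-r)\omega}} \right)$.

(g) Prove that $2^R g_p \le \frac{p-1}{2}$ and so deduce that the quantity in (f) is $\le \frac{1}{2^{\omega-1}} \varphi(n)$ and so is $\le \frac{1}{4}\varphi(n)$
if $\omega \ge 3$.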
(h) Show that there are $\le \frac{1}{4}\varphi(n)$ reduced residues mod n which are not witnesses, whenever
n > 10, with equality holding if and only if either
• n = pq where p = 2m + 1, q = 4m + 1 are primes with m odd, or
• n = pqr is a Carmichael number with p, q, r primes each $\equiv 3 \pmod 4$ (e.g., 7 · 19 · 67).


Appendices. The extended version of chapter 10 has the following additional 
appendices: 


Appendix 10B. Factoring with squares. We explain various factoring algorithms 
such as random squares, the continued fraction method, and the quadratic sieve and 
its variations, which all construct a multiple of n as the difference of two squares. 


Appendix 10C. Identifying primes of a given size. We establish primality tests 
that work when n — 1 or n+ 1 is partially factored. This is useful in practice 
for quickly finding large primes and was used in the recent proof of the ternary 
Goldbach conjecture. 


Appendix 10D. Carmichael numbers. We discuss a construction to find families 
of Carmichael numbers with many prime factors. 


Appendix 10E. Cryptosystems based on discrete logarithms. We describe how 
the discrete log problem lies behind some strong cryptographic protocols, for ex- 
ample the Diffie-Hellman key exchange and the El Gamal cryptosystem. 

Appendix 10F. Running times of algorithms. No one knows whether there is a
truly safe cryptographic protocol. We prove here that if there is one (appropriately
defined), then the complexity class NP must be strictly larger than the complexity
class P; that is, P ≠ NP, the most famous and tantalizing open question of theoretical
computer science. We also discuss how, although the overwhelming majority of
mathematical problems are not in P, we have yet to identify one specific example
that is not in P.

Appendix 10G. The AKS test. We prove that the AKS test, as given in Theorem
10.1, is a valid primality test, though we do not establish its running time.

Appendix 10H. Factoring algorithms for polynomials play an important role in 
number theory. Here we present the very useful Eisenstein irreducibility criterion 
to test whether a given polynomial can be factored into smaller parts. 


Chapter 11 


Rational approximations 
to real numbers 


How well can we approximate a real number by rational numbers? Obviously we
can approximate $\pi$ by 3, 3.1, 3.14, etc., but there are even better approximations
like $3, \frac{22}{7}, \frac{333}{106}, \frac{355}{113}, \ldots$ (see section 11.9 of appendix 11B for details). Are these the
“best” approximations? And how do we measure how good an approximation is?
We study these questions in detail in this chapter.

To start with we could ask how well we could approximate a rational number
$\alpha = p/q$ with (p,q) = 1 and $q \ge 1$, by other, unequal, rational numbers. For any
rational m/n with $n \ge 1$, which is $\ne p/q$, the difference is

(11.0.1) $\left| \frac{p}{q} - \frac{m}{n} \right| = \frac{|pn - qm|}{qn} \ge \frac{1}{qn}$,

since $|pn - qm|$ is a non-negative integer that cannot be 0 as $p/q \ne m/n$, and so
must be $\ge 1$. We have therefore shown that the difference between rational $\alpha$ and
an approximation m/n is at least some constant (in this case 1/q) times 1/n. We
will see in the next section that one obtains much better approximations when $\alpha$ is
real and irrational.


11.1. The pigeonhole principle 


If real irrational $\alpha$ is very close to m/n, then $n\alpha$ must be close to m, so we are
interested in how close the integer multiples of a given real number $\alpha$ can be to an
integer. Dirichlet noted that one can get a surprisingly good answer to this question
using the pigeonhole principle.

Theorem 11.1 (Dirichlet’s Theorem). Suppose that $\alpha$ is a given real number.
For every integer $N \ge 1$ there exists a positive integer $n \le N$ such that

$|n\alpha - m| < \frac{1}{N}$

for some integer m. In other words,

$\left| \alpha - \frac{m}{n} \right| < \frac{1}{nN}$.


Proof. The N + 1 numbers $\{0 \cdot \alpha\}, \{1 \cdot \alpha\}, \{2 \cdot \alpha\}, \ldots, \{N \cdot \alpha\}$ (where {t} denotes
the fractional part of t) all lie in the interval [0,1). The intervals

$\left[0, \frac{1}{N}\right), \left[\frac{1}{N}, \frac{2}{N}\right), \ldots, \left[\frac{N-1}{N}, 1\right)$

partition [0,1),[1] and so each of our N + 1 numbers lies in exactly one of the N
intervals. Therefore some interval must contain at least two of our numbers by the
pigeonhole principle, say $\{i\alpha\}$ and $\{j\alpha\}$ with $0 \le i < j \le N$, so that $|\{i\alpha\} - \{j\alpha\}| < \frac{1}{N}$.
Therefore, if n = j − i, then $1 \le n \le N$, and if $m := [j\alpha] - [i\alpha] \in \mathbb{Z}$, then

$n\alpha - m = (j\alpha - i\alpha) - ([j\alpha] - [i\alpha]) = \{j\alpha\} - \{i\alpha\}$,

and the first result follows by taking absolute values. The second result follows by
dividing through by n.
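The proof is constructive, and the pigeonhole search can be sketched directly. This is a minimal illustration using floating-point arithmetic, so it is only reliable for moderate N; the name `dirichlet_approx` is hypothetical.

```python
from math import floor, sqrt


def dirichlet_approx(alpha, N):
    """Return (n, m) with 1 <= n <= N and |n*alpha - m| < 1/N."""
    seen = {}                              # interval index -> multiplier i
    for j in range(N + 1):
        frac = j * alpha - floor(j * alpha)      # fractional part {j*alpha}
        box = min(int(frac * N), N - 1)          # which interval [box/N, (box+1)/N)
        if box in seen:                          # two numbers share an interval
            i = seen[box]
            return j - i, floor(j * alpha) - floor(i * alpha)
        seen[box] = j
    raise AssertionError("unreachable: N + 1 numbers lie in N intervals")
```

For $\alpha = \sqrt{2}$ and N = 10 the first collision is between $\{0 \cdot \alpha\}$ and $\{5\alpha\}$, giving n = 5 and m = 7, with $|5\sqrt{2} - 7| \approx 0.071 < 1/10$.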


Exercise 11.1.1. Prove that for any irrational real number $\alpha$ there are arbitrarily small real
numbers of the form $a + b\alpha$ with $a, b \in \mathbb{Z}$.


Corollary 11.1.1. If $\alpha$ is a real irrational number, then there are infinitely many
pairs m, n of coprime integers for which

$\left| \alpha - \frac{m}{n} \right| < \frac{1}{n^2}$.

For large n this is a far better approximation of $\alpha$ than one can obtain for
rational numbers, as we saw in (11.0.1).


Proof. Suppose that we are given a finite list, $(m_j, n_j)$, $1 \le j \le k$, of solutions
to this inequality. Since this is a finite list there is some solution with $|n_j\alpha - m_j|$
minimal, and $|n_j\alpha - m_j|$ must be > 0 as $\alpha$ is irrational. Therefore we can let N
be the smallest integer $\ge 1/\min_{1 \le j \le k}\{|n_j\alpha - m_j|\}$. By Dirichlet’s Theorem there
exists $n \le N$ such that

$\left| \alpha - \frac{m}{n} \right| < \frac{1}{nN} \le \frac{1}{n^2}$.

Now

$|n\alpha - m| < \frac{1}{N} \le |n_j\alpha - m_j|$ for all j,

and so (n, m) is another solution to the inequality, not included in the list. This
implies that any finite list of solutions can be extended, and so there are infinitely
many solutions.


[1] That is, each point of [0,1) lies in exactly one of these intervals, and the union of these intervals
exactly equals [0,1).

Dirichlet’s Theorem is a very useful result as we will now exhibit by reproving
two big results from earlier in the book:

Another proof of Corollary 3.5.2. [If (a,m) = 1, then a has an inverse mod
m.] Take $m \ge 2$. Let $\alpha = \frac{a}{m}$ and N = m − 1 in Dirichlet’s Theorem so that there
exist integers r and s with $r \le m - 1$ such that $|ra/m - s| < 1/(m-1)$; that is,
$|ra - sm| < m/(m-1) \le 2$. Hence $ra - sm = -1$, 0, or 1. It cannot equal 0 or
else $m \mid sm = ar$ and (m,a) = 1 so that $m \mid r$ which is impossible as r < m. Hence
$ra \equiv \pm 1 \pmod m$ and so $\pm r$ is the inverse of a (mod m).


We saw an important use of the pigeonhole principle in number theory in the
proof of Theorem 9.1, and this idea was generalized significantly by Minkowski and
others. Now we reprove Theorem using Dirichlet’s Theorem: 


Another proof of Theorem [If −1 is a square mod n, then n is the sum of
two squares.] Suppose that $r^2 \equiv -1 \pmod n$. By Dirichlet’s Theorem there exists
a positive integer $b < \sqrt{n}$ such that $\left| -\frac{r}{n} - \frac{c}{b} \right| < \frac{1}{b\sqrt{n}}$ for some integer c. Multiplying
through by bn we deduce that $|a| < \sqrt{n}$ where a = rb + cn. Now $a \equiv rb \pmod n$
and so $a^2 + b^2 \equiv r^2b^2 + b^2 = (r^2 + 1)b^2 \equiv 0 \pmod n$, and $0 < a^2 + b^2 < n + n = 2n$,
and so we must have $a^2 + b^2 = n$.
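The argument above says that some positive b below $\sqrt{n}$ works, with a the residue of rb of least absolute value. A direct search over b can be sketched as follows; this is an illustration only, and the name `two_squares` is hypothetical.

```python
from math import isqrt


def two_squares(n, r):
    """Given r with r^2 = -1 (mod n), return (a, b) with a^2 + b^2 = n."""
    if (r * r + 1) % n != 0:
        raise ValueError("r is not a square root of -1 mod n")
    for b in range(1, isqrt(n) + 1):
        a = r * b % n
        if a > n // 2:
            a -= n                     # residue of r*b of least absolute value
        if a * a + b * b == n:
            return abs(a), b
    return None
```

For example, $5^2 \equiv -1 \pmod{13}$ leads to $13 = 3^2 + 2^2$, and $12^2 \equiv -1 \pmod{29}$ leads to $29 = 5^2 + 2^2$.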


For irrational $\alpha$ one might ask how the numbers $\{\alpha\}, \{2\alpha\}, \ldots, \{N\alpha\}$ are
distributed in [0,1) as $N \to \infty$. In section 11.7 of appendix 11A
we will show that the values are dense and even (roughly) equally distributed in
(0,1). This ties in with the geometry of the torus and with exponential sum theory.


The next two exercises are multidimensional generalizations of Dirichlet’s The- 
orem with not dissimilar proofs. 


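Exercise 11.1.2 (Simultaneous approximation). Suppose that $\alpha_1, \ldots, \alpha_k$ are given real numbers.
Prove that for any positive integer N there exists a positive integer $n \le N^k$ such that, for each j
in the range $1 \le j \le k$, there exists an integer $m_j$ for which

$|n\alpha_j - m_j| < \frac{1}{N}$.

Deduce that given $\alpha_1, \ldots, \alpha_k \in \mathbb{R}$ there exist integers q, $1 \le q \le Q$, and $p_1, \ldots, p_k$ such that

$\left| \alpha_1 - \frac{p_1}{q} \right| \le \frac{1}{q^{1+1/k}}, \quad \left| \alpha_2 - \frac{p_2}{q} \right| \le \frac{1}{q^{1+1/k}}, \quad \ldots, \quad \left| \alpha_k - \frac{p_k}{q} \right| \le \frac{1}{q^{1+1/k}}$.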

Exercise 11.1.3. Suppose that $\alpha_1, \ldots, \alpha_k$ are given real numbers. Prove that for any positive
integer N there exist integers $n_1, n_2, \ldots, n_k$, not all zero, with each $|n_j| \le N$, and an integer m
for which

$|n_1\alpha_1 + n_2\alpha_2 + \cdots + n_k\alpha_k - m| < \frac{1}{N^k}$.


11.2. Pell’s equation


Perhaps the most researched equation in the early history of number theory is the
so-called Pell equation:[2] Are there non-trivial integer solutions x, y to

$x^2 - dy^2 = 1$?


(The “trivial solutions” are $x = \pm 1$ and y = 0.) The best-known ancient example
comes from comparing the number of points in triangles of points, with the number
of points in squares of points:

[Figure: a triangle of points with rows of 1, 2, 3, and 4 points, and a 4 × 4 square of points.]

This triangle has 1 + 2 + 3 + 4 = 10 points, whereas this square has 4 × 4 = 16. In
general a triangle with m rows has $\frac{m(m+1)}{2}$ points, and a square with n rows has $n^2$
points. The numbers appearing in these two lists are mostly different, but there are
exceptions, for example, 1, and then $36 = \frac{8 \cdot 9}{2} = 6^2$, and then $1225 = \frac{49 \cdot 50}{2} = 35^2$.
So are there arbitrarily many “triangular numbers” that are also squares? More
precisely, we are asking whether there are infinitely many pairs of integers m, n
such that

$\frac{m(m+1)}{2} = n^2$.

It makes sense to clear denominators and to “complete the square” on the left side.
Then we get

$(2m+1)^2 = 4m^2 + 4m + 1 = 8 \cdot \frac{m(m+1)}{2} + 1 = 8n^2 + 1$.

Taking x = 2m + 1 and y = 2n gives a solution to the Pell equation

$x^2 - 2y^2 = 1$.

On the other hand note that any solution to the Pell equation must have x odd, so
is of the form 2m + 1, which implies that $2y^2 = x^2 - 1 \equiv 1 - 1 \equiv 0 \pmod 8$ and so y
is even and therefore must be of the form 2n. (Our examples of triangular numbers
above therefore correspond to the solutions $3^2 - 2 \cdot 2^2 = 1$, $17^2 - 2 \cdot 12^2 = 1$, and
$99^2 - 2 \cdot 70^2 = 1$ to Pell’s equation.) So we have proved that the set of triangular
numbers that are also squares are in 1-to-1 correspondence with the positive integer
solutions to this Pell equation.
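The correspondence can be explored numerically: for each n one tests whether $8n^2 + 1$ is a perfect square $x^2$, and if so recovers m from x = 2m + 1. The sketch below is a brute-force illustration only; the name `square_triangular` is hypothetical.

```python
from math import isqrt


def square_triangular(limit):
    """List (m, n, n*n) with m(m+1)/2 == n*n for 1 <= n <= limit."""
    found = []
    for n in range(1, limit + 1):
        y = 2 * n
        x2 = 2 * y * y + 1             # x^2 = 2y^2 + 1 = 8n^2 + 1
        x = isqrt(x2)
        if x * x == x2:
            m = (x - 1) // 2           # x = 2m + 1
            found.append((m, n, n * n))
    return found
```

Up to n = 250 this finds the square triangular numbers 1, 36, 1225, and 41616.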


[2] In 1657 Fermat challenged Frénicle, Brouncker, Wallis, and “all mathematicians” to create a
method for finding solutions to Pell’s equation. Brouncker showed that he had done so by determining
the smallest solution for d = 313, namely x = 32188120829134849, y = 1819380158564160. It seems
that Euler attributed the equation to Pell because Rahn published an algebra book with Pell’s help in
1658, which contained an example of this type of equation. The name stuck.

We will show in Theorem 11.2 that there is a non-trivial solution to Pell’s
equation $x^2 - dy^2 = 1$ for every non-square integer d > 1. This was evidently
known to Brahmagupta in India in 628 A.D., and one can guess that it was well
understood by Archimedes far earlier, judging by his “Cattle Problem”:


The Sun god’s cattle, friend, apply thy care 

to count their number, hast thou wisdom’s share. 
They grazed of old on the Thrinacian floor 

of Sic’ly’s island, herded into four, 

colour by colour: one herd white as cream, 

the next in coats glowing with ebon gleam, 
brown-skinned the third, and stained with spots the last.

Each herd saw bulls in power unsurpassed, 

in ratios these: count half the ebon-hued, 

add one third more, then all the brown include; 
thus, friend, canst thou the white bulls’ number tell. 
The ebon did the brown exceed as well, 

now by a fourth and fifth part of the stained. 

To know the spotted — all bulls that remained — 
reckon again the brown bulls, and unite 

these with a sixth and seventh of the white. 
Among the cows, the tale of silver-haired 

was, when with bulls and cows of black compared, 
exactly one in three plus one in four. 

The black cows counted one in four once more, 
plus now a fifth, of the bespeckled breed 

when, bulls withal, they wandered out to feed. 
The speckled cows tallied a fifth and sixth 


of all the brown-haired, males and females mixed. 
Lastly, the brown cows numbered half a third 
and one in seven of the silver herd. 

Tell’st thou unfailingly how many head 

the Sun possessed, o friend, both bulls well-fed 
and cows of ev’ry colour — no-one will 

deny that thou hast numbers’ art and skill, 
though not yet dost thou rank among the wise. 
But come! also the foll’wing recognise. 


Whene’er the Sun god’s white bulls joined the black,

their multitude would gather in a pack 

of equal length and breadth, and squarely throng 
Thrinacia’s territory broad and long. 

But when the brown bulls mingled with the flecked, 
in rows growing from one would they collect, 
forming a perfect triangle, with ne’er 

a diff’rent-coloured bull, and none to spare. 
Friend, canst thou analyse this in thy mind, 

and of these masses all the measures find, 

go forth in glory! be assured all deem 

thy wisdom in this discipline supreme! 


— from an epigram written to ERATOSTHENES of Cyrene
by ARCHIMEDES (of Alexandria), 250 B.C.[3]


The first paragraph involves only linear equations. To resolve the second, one needs
to find a non-trivial solution in integers u, v to

$u^2 - 609 \cdot 7766\, v^2 = 1$.

The smallest solution is enormous, the smallest herd having about $7.76 \times 10^{206544}$
cattle: It wasn’t until 1965 that anyone was able to write down all 206545 decimal
digits! How did Archimedes know that the solution would be ridiculously large?
We don’t know, though presumably he did not ask this question by chance.


The next result, the main result of this section, presumably known to many 
ancient mathematicians, is that there is always a solution to Pell’s equation. 


Theorem 11.2. Let d > 2 be a given non-square integer. There exist integers x,y 
for which 


with y £0. Ifx1,y1 yields the smallest solution in positive integers|4] then all other 
solutions are given by the recursion 
forn> 1. 


Intl =T%Iyt+dyiyn and Yns1 =LiYnt+ yen 


We call the pair $(x_1, y_1)$ the fundamental solution to Pell’s equation. Another way
to write the recursion is that

$$x_n + \sqrt{d}\,y_n = (x_1 + \sqrt{d}\,y_1)^n \quad\text{for every integer } n \ge 1,$$

where we match the coefficients of $\sqrt{d}$ on each side to determine $y_n$, and what
remains, the coefficients of 1 on each side, to determine $x_n$.

³ Archimedes, The Cattle Problem, in English verse by S. J. P. Hillion & H. W. Lenstra Jr.,
Mercator, Santpoort, 1999.

⁴ We measure the size of the solutions in positive integers x,y by the number $x + \sqrt{d}\,y$, though we
would have the same ordering if we used either x or y.
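The recursion in Theorem 11.2 is easy to run by machine. The following is a minimal sketch, assuming the fundamental solution is already known; the function name `pell_solutions` is ours and does not come from the text.

```python
def pell_solutions(d, x1, y1, count):
    """Return the first `count` solutions (x_n, y_n) of x^2 - d*y^2 = 1,
    generated from the fundamental solution (x1, y1) by
    x_{n+1} = x1*x_n + d*y1*y_n,  y_{n+1} = x1*y_n + y1*x_n."""
    solutions = [(x1, y1)]
    x, y = x1, y1
    for _ in range(count - 1):
        x, y = x1 * x + d * y1 * y, x1 * y + y1 * x
        solutions.append((x, y))
    return solutions


if __name__ == "__main__":
    # d = 2 with fundamental solution (3, 2)
    for x, y in pell_solutions(2, 3, 2, 5):
        print(x, y, x * x - 2 * y * y)
```

Python integers have arbitrary precision, so the same loop keeps working long after the solutions outgrow machine words.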


Proof. We begin by showing that there always exists a solution to $x^2 - dy^2 = 1$
in integers with $y \neq 0$. By Corollary 11.1.1 there exist infinitely many pairs of
integers (m,n) such that $\left|\sqrt{d} - \frac{m}{n}\right| < \frac{1}{n^2}$. For these pairs (m,n) we have

$$|m^2 - dn^2| = n^2 \left|\sqrt{d} - \frac{m}{n}\right| \cdot \left|\sqrt{d} + \frac{m}{n}\right| < \left|\sqrt{d} + \frac{m}{n}\right| \le 2\sqrt{d} + \left|\sqrt{d} - \frac{m}{n}\right| < 2\sqrt{d} + 1.$$

This implies that $|m^2 - dn^2|$ must be an integer $< 2\sqrt{d} + 1$, so there must be
some non-zero integer r, with $|r| < 2\sqrt{d} + 1$, for which there are infinitely many
pairs of positive integers m,n such that $m^2 - dn^2 = r$. Pick the smallest such r.
We can assume that each (m,n) = 1 or else if (m,n) = g occurs infinitely often,
then we have infinitely many solutions $(m/g)^2 - d(n/g)^2 = r/g^2$, contradicting the
minimality of r.

Since there are only $r^2$ pairs of residue classes (m mod r, n mod r) there
must be some pair of residue classes a, b such that there are infinitely many pairs of
integers m,n for which $m^2 - dn^2 = r$ with $m \equiv a \pmod r$ and $n \equiv b \pmod r$. Let
$m_1, n_1$ be the smallest such pair, and m,n any other such pair, so that $m_1^2 - dn_1^2 =
m^2 - dn^2 = r$ with $m_1 \equiv m \pmod r$ and $n_1 \equiv n \pmod r$. This implies that
$r \mid (m_1 n - n_1 m)$ and

$$(m_1 m - d n_1 n)^2 - d(m_1 n - n_1 m)^2 = (m_1^2 - dn_1^2)(m^2 - dn^2) = r^2,$$

so that $r^2$ divides $r^2 + d(m_1 n - n_1 m)^2 = (m_1 m - d n_1 n)^2$, and thus $r \mid (m_1 m - d n_1 n)$.
Therefore $x = |m_1 m - d n_1 n|/r$ and $y = |m_1 n - n_1 m|/r$ are integers for which
$x^2 - dy^2 = 1$.


Exercise 11.2.1. Show that $y \neq 0$ using the fact that (m,n) = 1 for each such pair m,n.


We measure the size of solutions to Pell’s equation, using the number $x + \sqrt{d}\,y$.
If x,y > 0, then this is > 1. There are four solutions associated with each solution
in positive integers u,v, and for these we have

$$u + \sqrt{d}\,v > 1 > u - \sqrt{d}\,v > 0 > -u + \sqrt{d}\,v > -1 > -u - \sqrt{d}\,v.$$

Therefore x,y > 0 if and only if $x + \sqrt{d}\,y > 1$.


Let $x_1, y_1$ be the solution to $x^2 - dy^2 = 1$ in positive integers with $x_1 + \sqrt{d}\,y_1$
minimal. We claim that all other solutions with x,y > 0 take the form $x + \sqrt{d}\,y =
(x_1 + \sqrt{d}\,y_1)^n$. If not, let x,y be the counterexample with x,y > 0 for which $x + \sqrt{d}\,y$
is smallest. Now $x + \sqrt{d}\,y > x_1 + \sqrt{d}\,y_1$ since $x_1 + \sqrt{d}\,y_1$ is minimal.

If $X = x_1 x - d y_1 y$ and $Y = x_1 y - y_1 x$, then $X^2 - dY^2 = (x_1^2 - dy_1^2)(x^2 - dy^2) = 1$,
and

$$X + \sqrt{d}\,Y = (x_1 - \sqrt{d}\,y_1)(x + \sqrt{d}\,y) = \frac{x + \sqrt{d}\,y}{x_1 + \sqrt{d}\,y_1},$$

which implies that

$$1 < X + \sqrt{d}\,Y < x + \sqrt{d}\,y.$$

Hence X,Y > 0, and since x,y was the smallest counterexample, we deduce that

$$X + \sqrt{d}\,Y = (x_1 + \sqrt{d}\,y_1)^m \quad\text{for some integer } m \ge 1,$$

and therefore $x + \sqrt{d}\,y = (x_1 + \sqrt{d}\,y_1)(X + \sqrt{d}\,Y) = (x_1 + \sqrt{d}\,y_1)^{m+1}$, a contradiction.


If we define $x_n + \sqrt{d}\,y_n = (x_1 + \sqrt{d}\,y_1)^n$, then we obtain the recursion given in
the theorem by an easy induction argument. We also deduce that the $x_n, y_n > 0$
and so $x_1 < x_2 < \cdots$ and $y_1 < y_2 < \cdots$ from the recursion formulas.


Exercise 11.2.2. Prove that if $a + \sqrt{d}\,b = x + \sqrt{d}\,y$ where a,b,x,y,d are integers and d is not a
square, then a = x and b = y.


Exercise 11.2.3. Prove, by induction, that $x_{n+2} = 2x_1 x_{n+1} - x_n$ and $y_{n+2} = 2x_1 y_{n+1} - y_n$ for
all $n \ge 0$.


Exercise 11.2.4. Show that all solutions to Pell’s equation (not just the positive integer solutions)
are given by the values $\pm(x_1 + \sqrt{d}\,y_1)^n$ (not just “+”), with $n \in \mathbb{Z}$ (not just $n \in \mathbb{N}$).


For technical reasons it is actually best to develop the analogous theory for the
solutions to $x^2 - dy^2 = \pm 4$, as in appendix 11B, when we revisit Pell equations.


In the second half of the proof we saw how all of the solutions in positive 
integers can be generated from a fundamental solution. The proof is interesting 
in that it works by “descent”: Given a solution we find a smaller one. This is a 
technique that we saw several times in chapter 6. We will see it play a central role 
in section 11.3 and later when we study elliptic curves in chapter 17.


The proof of Theorem 11.2 is not constructive, in that the proof does not
indicate how to find a solution. In Lemma of appendix 11B we will show
how to find solutions using the continued fraction for $\sqrt{d}$ (as was known to all
of the ancient mathematicians discussed here). How large is the smallest solution
to Pell’s equation? We saw that it can be surprisingly large, as in Archimedes’s
cattle problem. One can prove that the smallest solution is $\le (8d)^{\sqrt{d}}$ (see section
13.7 of appendix 13B). However what is surprising is that the smallest solution
seems to usually be this large. This is not something that has been proved; indeed
understanding the distribution of sizes of the smallest solutions to Pell’s equation
is an outstanding open question in number theory.
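To get a feel for how erratically the size of the smallest solution varies with d, one can compute it from the continued fraction of $\sqrt{d}$. The sketch below uses the standard textbook recurrence for the partial quotients and convergents of $\sqrt{d}$; it is not taken from this chapter, and the name `pell_fundamental` is ours.

```python
from math import isqrt


def pell_fundamental(d):
    """Smallest positive solution (x, y) of x^2 - d*y^2 = 1, found by walking
    through the convergents h/k of the continued fraction of sqrt(d)."""
    a0 = isqrt(d)
    if a0 * a0 == d:
        raise ValueError("d must not be a perfect square")
    m, q, a = 0, 1, a0          # state of the continued fraction expansion
    h_prev, h = 1, a0           # numerators of consecutive convergents
    k_prev, k = 0, 1            # denominators of consecutive convergents
    while h * h - d * k * k != 1:
        m = q * a - m
        q = (d - m * m) // q
        a = (a0 + m) // q
        h_prev, h = h, a * h + h_prev
        k_prev, k = k, a * k + k_prev
    return h, k


if __name__ == "__main__":
    for d in (60, 61, 62):
        print(d, pell_fundamental(d))
```

Neighbouring values of d can behave very differently: d = 60 and d = 62 have two-digit fundamental solutions, while d = 61 needs ten digits.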


In Theorem 11.2 we saw that if d > 1 is a non-square integer, then there are
always solutions in integers x,y > 0 to Pell’s equation $x^2 - dy^2 = 1$. This implies
that

$$\sqrt{d}\,y\,(x - \sqrt{d}\,y) < (x + \sqrt{d}\,y)(x - \sqrt{d}\,y) = 1,$$

and so, dividing through by $\sqrt{d}\,y^2$, we exhibit rational approximations x/y to $\sqrt{d}$
that satisfy

$$\left| \frac{x}{y} - \sqrt{d} \right| < \frac{1}{\sqrt{d}\,y^2},$$

which are better approximations than those that are given by Corollary 11.1.1.

Another issue is whether there is a solution to $u^2 - dv^2 = -1$, the negative
Pell equation. Notice, for example, that $2^2 - 5 \cdot 1^2 = -1$. Evidently if there is a
solution, then −1 is a square mod d, so that d has no prime factors $\equiv -1 \pmod 4$.
Moreover d cannot be divisible by 4 or else $u^2 \equiv -1 \pmod 4$ which is impossible.

We saw that $x^2 - dy^2 = 1$ has solutions for every non-square d > 1, and one might
have guessed that there would be some simple criterion to decide whether there are
solutions to $u^2 - dv^2 = -1$, but there does not appear to be. For example there are
no solutions for d = 34, 205, or 221, yet in each case there is no congruence that
easily explains why not. This is a subject of ongoing research. We will discuss the
negative Pell equation in the next paragraph as well as in section 11.13 of appendix
11B.


The case d = 5 has many fascinating properties. For example

$$1^2 - 5 \cdot 1^2 = -4, \quad 3^2 - 5 \cdot 1^2 = 4, \quad 4^2 - 5 \cdot 2^2 = -4, \quad 7^2 - 5 \cdot 3^2 = 4, \ \ldots$$

All these solutions to $x^2 - 5y^2 = -4$ or 4 are given by $\frac{x + \sqrt{5}\,y}{2} = \left(\frac{1 + \sqrt{5}}{2}\right)^n$. If
there are solutions to $x^2 - dy^2 = \pm 4$ with x,y both odd (as in this example), then
$1 - d \equiv x^2 - dy^2 \equiv 0 \pmod 4$; that is, $d \equiv 1 \pmod 4$. If $d \equiv 1 \pmod 4$, then the
proof of Theorem 11.2 can be used to prove there exist integers u,v > 0 such that:

All solutions to $x^2 - dy^2 = \pm 4$ with x,y > 0 are given by

$$\frac{x + \sqrt{d}\,y}{2} = \left(\frac{u + \sqrt{d}\,v}{2}\right)^n \quad\text{for some integer } n \ge 1.$$

To establish that there is at least one solution take x = 2r, y = 2s from a solution
to $r^2 - ds^2 = 1$ given by Theorem 11.2. Now select the solution to our equation
with $\frac{u + \sqrt{d}\,v}{2} > 1$ but minimal. The proof of Theorem 11.2, suitably modified, then
gives that all other solutions are given by a power of this first one.

We call $\frac{u + \sqrt{d}\,v}{2}$ the fundamental solution to Pell’s equation and denote it by $\epsilon_d$.
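For d = 5 the powers of $\frac{1+\sqrt{5}}{2}$ can be tabulated with integer arithmetic only. The following sketch (the function name `golden_powers` is ours) multiplies $\frac{x + \sqrt{5}\,y}{2}$ by $\frac{1+\sqrt{5}}{2}$ repeatedly and records the value of $x^2 - 5y^2$ at each step.

```python
def golden_powers(count):
    """For n = 1..count, write ((1 + sqrt5)/2)^n = (x + sqrt5*y)/2 and return
    the tuples (n, x, y, x^2 - 5*y^2)."""
    x, y = 1, 1
    rows = []
    for n in range(1, count + 1):
        rows.append((n, x, y, x * x - 5 * y * y))
        # (x + sqrt5*y)/2 * (1 + sqrt5)/2 = ((x + 5y)/2 + sqrt5*(x + y)/2) / 2;
        # x and y always have the same parity, so both divisions are exact.
        x, y = (x + 5 * y) // 2, (x + y) // 2
    return rows


if __name__ == "__main__":
    for row in golden_powers(8):
        print(row)
```

The x-values that appear are the Lucas numbers and the y-values the Fibonacci numbers, and the last entry alternates between −4 and 4.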


Exercise 11.2.5. The smallest solution to $x^2 - 2y^2 = 1$ is given by (x,y) = (3,2), which implies
that $2^3$ and $3^2$ are consecutive powerful numbers (integer n is powerful if $p^2$ divides n whenever
a prime p divides n). Use the theory of the solutions to $x^2 - 2y^2 = 1$ to prove that there are
infinitely many pairs of consecutive powerful numbers.


11.3. Descent on solutions of $x^2 - dy^2 = n$, d > 0


Let $x_1, y_1$ be the fundamental solution to Pell’s equation, and let $\epsilon_d = x_1 + y_1\sqrt{d}$
as in Theorem 11.2 so that $\epsilon_d > 1$.


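Proposition 11.3.1. Given integers d, n > 0, the integer solutions x,y to $x^2 -
dy^2 = n$ are all given by $x + y\sqrt{d} = \pm\epsilon_d^k \beta$ for some integer k, where

$$\beta \in B := \{u + \sqrt{d}\,v \in [\sqrt{n}, \sqrt{n}\,\epsilon_d) : u, v \ge 1 \text{ and } u^2 - dv^2 = n\}.$$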


Proof. Given a solution to $x^2 - dy^2 = n$, let $\alpha = |x + y\sqrt{d}|$. As $\epsilon_d > 1$ the
sequence of numbers $1, \epsilon_d, \epsilon_d^2, \ldots$ increases to infinity, and the sequence of numbers
$1, \epsilon_d^{-1}, \epsilon_d^{-2}, \ldots$ decreases to 0. Therefore there exists a unique integer k such that

$$\sqrt{n}\,\epsilon_d^{k} \le \alpha < \sqrt{n}\,\epsilon_d^{k+1}.$$

Let $\beta := \alpha\,\epsilon_d^{-k}$, so that $\sqrt{n} \le \beta < \sqrt{n}\,\epsilon_d$. Therefore $x + y\sqrt{d} = \pm\alpha$ is of the form $\pm\beta\,\epsilon_d^{k}$, where
$\beta \in [\sqrt{n}, \sqrt{n}\,\epsilon_d)$. Writing $\beta = u + \sqrt{d}\,v$ we obtain

$$u^2 - dv^2 = |(x + y\sqrt{d})(x - y\sqrt{d})| \cdot \big((x_1 + y_1\sqrt{d})(x_1 - y_1\sqrt{d})\big)^{-k} = (x^2 - dy^2)(x_1^2 - dy_1^2)^{-k} = n \cdot 1^{-k} = n.$$

Moreover for a solution of $r^2 - ds^2 = n$ where n > 0, with r,s > 0, we have

$$\gamma := r + s\sqrt{d} > \sqrt{n} > n/\gamma = r - s\sqrt{d} > 0 > -r + s\sqrt{d} > -r - s\sqrt{d},$$

so of these four closely related solutions the unique one $> \sqrt{n}$ has both coordinates
positive. In particular this implies that u,v > 0, so that $\beta \in B$.


For n = 1 we have B = {1}. In some questions B can be empty; in others it
can be large. For example, there are no solutions to $x^2 - dy^2 = n$ in integers if n
is not a square mod d.


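In the example $x^2 - 5y^2 = 209$, we have $\epsilon_5 = \left(\frac{1+\sqrt{5}}{2}\right)^6 = 9 + 4\sqrt{5}$ and, after a
brief search, we discover that $B = \{17 + 4\sqrt{5},\ 23 + 8\sqrt{5},\ 47 + 20\sqrt{5},\ 73 + 32\sqrt{5}\}$.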
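The "brief search" for the set B of Proposition 11.3.1 can be automated. The following is a brute-force sketch, assuming d and n are small enough for floating-point square roots to decide membership in $[\sqrt{n}, \sqrt{n}\,\epsilon_d)$ reliably; the function name `descent_set` is ours.

```python
from math import isqrt, sqrt


def descent_set(d, n, x1, y1):
    """List the pairs (u, v) with u, v >= 1, u^2 - d*v^2 = n and
    sqrt(n) <= u + sqrt(d)*v < sqrt(n) * (x1 + y1*sqrt(d)),
    where (x1, y1) is the fundamental solution of x^2 - d*y^2 = 1."""
    root_d = sqrt(d)
    lower = sqrt(n)
    upper = lower * (x1 + y1 * root_d)
    found = []
    v = 1
    while v * root_d < upper:        # u > 0, so u + sqrt(d)*v < upper forces this
        u_squared = n + d * v * v
        u = isqrt(u_squared)
        if u * u == u_squared and lower <= u + v * root_d < upper:
            found.append((u, v))
        v += 1
    return found


if __name__ == "__main__":
    print(descent_set(5, 209, 9, 4))
```

Every integer solution of $x^2 - 5y^2 = 209$ is then obtained from one of the listed pairs by multiplying by a power of $9 + 4\sqrt{5}$ and choosing a sign.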


Exercise 11.3.1. Find all integer solutions x,y to (a) $x^2 - 5y^2 = -4$; (b) z
(c) $x^2 - 5y^2 = -1$; (d) $x^2 - 5y^2 = 1$; (e) $x^2 - 5y^2 = -20$; (f) $x^2 - 5y^2 = -11$.


Exercise 11.3.2. Prove that for any non-square positive integer d and integer n there is either
no solution or infinitely many solutions to $x^2 - dy^2 = n$.


11.4. Transcendental numbers 


In section 3.4 we proved that $\sqrt{d}$ is irrational if d is an integer that is not the
square of an integer. We can also prove that certain numbers are irrational simply 
by establishing how well they can be approximated by rationals: 


Proposition 11.4.1. Suppose that α is a given real number. Then α is irrational
if and only if for every integer $q \ge 1$ there exist integers m,n such that

$$0 < |n\alpha - m| < \frac{1}{q}.$$


Proof. If α is rational, then α = p/q for some coprime integers p,q with $q \ge 1$.
For any integers m,n we then have $n\alpha - m = (np - mq)/q$. Now, the value of
$np - mq$ is an integer $\equiv np \pmod q$. Hence $|np - mq| = 0$ or is an integer $\ge 1$, and
therefore $|n\alpha - m| = 0$ or is $\ge 1/q$.

If α is irrational, then Corollary 11.1.1 tells us that there are arbitrarily large
coprime integers m,n for which $0 < |n\alpha - m| < \frac{1}{n}$. We select n > q to prove the
result claimed here.


There are several other methods to prove that numbers are irrational, but it
is more challenging to prove that a number is transcendental, that is, that the
number is not the root of a polynomial with integer coefficients.⁵ Next we show
that algebraic numbers cannot be too well approximated by rationals. This suggests
a method to identify a number as transcendental, generalizing how we identified
irrationality in Proposition 11.4.1.


Theorem 11.3 (Liouville’s Theorem). Suppose that α is a root of an irreducible
polynomial $f(x) \in \mathbb{Z}[x]$ of degree $d \ge 2$. There exists a constant $c_\alpha > 0$ (which
depends only on α)⁶ such that for any rational p/q with (p,q) = 1 and $q \ge 1$ we
have

$$\left| \alpha - \frac{p}{q} \right| \ge \frac{c_\alpha}{q^d}.$$

⁵ The root of a polynomial with integer coefficients is called an algebraic number.

Proof. Since $I := [\alpha - 1, \alpha + 1]$ is a closed interval, there exists a bound $B \ge 1$
for which $|f'(t)| \le B$ for all $t \in I$. We will prove the result with $c_\alpha = 1/B$. If
$p/q \notin I$, then $|\alpha - p/q| \ge 1 \ge c_\alpha \ge c_\alpha/q^d$ as desired. Henceforth we may assume
that $p/q \in I$.

If $f(x) = \sum_{i=0}^{d} f_i x^i$ with each $f_i \in \mathbb{Z}$, then $q^d f(p/q) = \sum_{i=0}^{d} f_i p^i q^{d-i} \in \mathbb{Z}$.
Now $f(p/q) \neq 0$ since f is irreducible of degree $\ge 2$ and so $|q^d f(p/q)| \ge 1$.

The mean value theorem tells us that there exists t lying between α and p/q,
and hence in I, such that

$$f'(t) = \frac{f(\alpha) - f(p/q)}{\alpha - p/q}.$$

Therefore, as $f(\alpha) = 0$,

$$\left| \alpha - \frac{p}{q} \right| = \frac{|q^d f(p/q)|}{q^d\,|f'(t)|} \ge \frac{1}{B q^d} = \frac{c_\alpha}{q^d}.$$

Often students first learn to prove that there are transcendental numbers by 
showing that the set of real numbers is uncountable; in contrast, the set of algebraic 
numbers is countable, so the vast majority of real numbers are transcendental. 
This argument yields that most real numbers are transcendental, without actually 
constructing any! (See section 11.16 in appendix 11D.) The great advantage of
Liouville’s Theorem is that it can be used to actually construct transcendental 
numbers. 


Corollary 11.4.1. A Liouville number is an irrational real number α such that
for every integer $n \ge 1$ there is a rational number p/q with (p,q) = 1 and q > 1 for
which

$$\left| \alpha - \frac{p}{q} \right| < \frac{1}{q^n}.$$

Every Liouville number is transcendental.

Proof. Let α be a Liouville number. Suppose that α is algebraic so that there
exist d and $c_\alpha$ as in Liouville’s Theorem. Select n > d sufficiently large so that
$2^{n-d} > 1/c_\alpha$. Then, selecting the approximation p/q with q > 1 as in the hypothesis
we have

$$\frac{c_\alpha}{q^d} \le \left| \alpha - \frac{p}{q} \right| < \frac{1}{q^n}$$

by Liouville’s Theorem. Therefore $2^{n-d} > 1/c_\alpha > q^{n-d}$, contradicting that $q \ge 2$.
Therefore α is not algebraic and so must be transcendental.


⁶ In this chapter there are several constants like $c_\alpha$ which depend only on the variable given in the
subscript. We do not attempt to be more precise about the constant because calculating a value for
the constant will make things much more complicated, yet one will gain little from knowing its precise
value.




For example

$$\alpha = \frac{1}{10} + \frac{1}{10^{2!}} + \frac{1}{10^{3!}} + \cdots$$

is a Liouville number, since if p/q with $q = 10^{n!}$ is the sum of the first n terms,
then $0 < \alpha - p/q < 2/q^{n+1} < 1/q^n$.
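The inequalities for the partial sums can be verified exactly with rational arithmetic. This is a sketch, assuming we bound the infinite tail rather than compute it: after `n + extra` terms the omitted tail is less than twice its first term. The helper names are ours.

```python
from fractions import Fraction
from math import factorial


def liouville_partial(n):
    """Sum of the first n terms of 1/10^(1!) + 1/10^(2!) + 1/10^(3!) + ..."""
    return sum(Fraction(1, 10 ** factorial(k)) for k in range(1, n + 1))


def gap_bounds(n, extra=2):
    """Exact lower and upper bounds for alpha - p/q, where p/q is the n-th
    partial sum and alpha is the full series."""
    lower = liouville_partial(n + extra) - liouville_partial(n)
    # The terms beyond n + extra add up to less than 2 / 10^((n + extra + 1)!).
    upper = lower + Fraction(2, 10 ** factorial(n + extra + 1))
    return lower, upper


if __name__ == "__main__":
    for n in range(1, 4):
        q = 10 ** factorial(n)
        lower, upper = gap_bounds(n)
        ok = 0 < lower and upper < Fraction(2, q ** (n + 1)) < Fraction(1, q ** n)
        print(n, ok)
```

Using `Fraction` avoids any rounding, which matters here because the quantities involved are far below floating-point resolution.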


Liouville numbers are easily identifiable transcendental numbers, but there are 
many transcendental numbers which are not Liouville numbers, like π and e.


Liouville’s Theorem has been improved to its, more or less, final form by Roth.
To explain his result we have to introduce an ε and that sort of thing: For any
fixed ε > 0 (which should be thought of as being small), there exists a constant
$\kappa_\epsilon > 0$, which depends on ε, and is chosen so it works in the proof.⁷ In the notation
in Roth’s Theorem we have to go a little further than this since the constant also
depends on the value of α we need to approximate, so our constant is $c_{\alpha,\epsilon}$, which
depends on both α and ε, but nothing else. These dependencies do restrict our use
of the inexplicit constants $c_{\alpha,\epsilon}$; for example, one cannot compare the constants that
arise from different values of α.


Theorem 11.4 (Roth’s Theorem, 1955). Suppose that α is an irrational real
algebraic number. For any fixed ε > 0 there exists a constant $c_{\alpha,\epsilon} > 0$ such that
for any rational p/q with (p,q) = 1 and $q \ge 1$ we have

$$\left| \alpha - \frac{p}{q} \right| \ge \frac{c_{\alpha,\epsilon}}{q^{2+\epsilon}}.$$


The exponent “2 + ε” in Roth’s Theorem cannot be improved much since if α
is irrational, then there are infinitely many p/q with $\left|\alpha - \frac{p}{q}\right| < \frac{1}{q^2}$, by Corollary
11.1.1. We will prove that approximations which are a little better than this must be
convergents of the continued fraction of α (see Corollary 11.10.1 in section 11.10 of
appendix 11B). The “worst approximable” irrational number is therefore $\frac{1+\sqrt{5}}{2}$, for
which the best approximations are given by $F_{n+1}/F_n$ where $F_n$ is the nth Fibonacci
number. One can show that the difference, $\left|\frac{1+\sqrt{5}}{2} - \frac{F_{n+1}}{F_n}\right|$, is roughly $1/(\sqrt{5}\,F_n^2)$
with an error $< 1/F_n^4$.
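The size of the difference can be checked numerically. The sketch below uses Python's `decimal` module at 60 significant digits (an arbitrary choice that comfortably exceeds what is needed for the range tested); the helper names are ours.

```python
from decimal import Decimal, getcontext

getcontext().prec = 60
SQRT5 = Decimal(5).sqrt()
PHI = (1 + SQRT5) / 2


def fib_pairs(count):
    """Yield (n, F_n, F_{n+1}) for n = 1..count, with F_1 = F_2 = 1."""
    a, b = 1, 1
    for n in range(1, count + 1):
        yield n, a, b
        a, b = b, a + b


def approximation_error(fn, fn1):
    """Return (|phi - F_{n+1}/F_n|, 1/(sqrt5 * F_n^2)) as Decimals."""
    difference = abs(PHI - Decimal(fn1) / Decimal(fn))
    main_term = 1 / (SQRT5 * fn * fn)
    return difference, main_term


if __name__ == "__main__":
    for n, fn, fn1 in fib_pairs(12):
        difference, main_term = approximation_error(fn, fn1)
        print(n, fn1, "/", fn, float(difference), float(main_term))
```

The two printed columns agree to more and more digits as n grows, in line with the stated error term.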


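Exercise 11.4.1. Prove that if $\alpha \in \mathbb{C} \setminus \mathbb{R}$, then there exists a constant $\beta_\alpha > 0$ such that $|\alpha - p/q| \ge
\beta_\alpha$ for all rational approximations p/q.


Exercise 11.4.2. Prove that if $f(t) = a_d \prod_{i=1}^{d} (t - \alpha_i)$, then $f'(\alpha_j) = a_d \prod_{i \ne j} (\alpha_j - \alpha_i)$.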


There are many beautiful applications of Roth’s Theorem to Diophantine equa- 
tions. We highlight one: 


Corollary 11.4.2 (Thue-Siegel Theorem). Suppose that $f(t) = a_0 + a_1 t + \cdots +
a_d t^d \in \mathbb{Z}[t]$ is an irreducible polynomial of degree $d \ge 3$. Then for any integer A
there are only finitely many pairs of integers m,n for which

$$n^d f(m/n) = a_0 n^d + a_1 n^{d-1} m + \cdots + a_d m^d = A.$$


⁷ A proof that is far too involved for inclusion in this book.

Proof. If A = 0, the only solution is m = n = 0, as f is irreducible. So we
may assume that $|A| \ge 1$ and write $f(t) = a_d \prod_{i=1}^{d} (t - \alpha_i)$; the $\alpha_i$ are distinct as
f(t) is irreducible. For any given pair of integers m,n select j so that $\left|\alpha_j - \frac{m}{n}\right|$ is
minimized. If $i \ne j$, then

$$2\left|\alpha_i - \frac{m}{n}\right| \ge \left|\alpha_i - \frac{m}{n}\right| + \left|\alpha_j - \frac{m}{n}\right| \ge |\alpha_i - \alpha_j|$$

so that, since $f'(\alpha_j) = a_d \prod_{1 \le i \le d,\ i \ne j} (\alpha_j - \alpha_i)$ (as in exercise 11.4.2),

$$\left|\alpha_j - \frac{m}{n}\right| \cdot \frac{|f'(\alpha_j)|}{2^{d-1}} = \left|\alpha_j - \frac{m}{n}\right| \cdot |a_d| \prod_{\substack{1 \le i \le d \\ i \ne j}} \frac{|\alpha_i - \alpha_j|}{2} \le |a_d| \prod_{1 \le i \le d} \left|\alpha_i - \frac{m}{n}\right| = \frac{|a_0 n^d + a_1 n^{d-1} m + \cdots + a_d m^d|}{|n|^d} = \frac{|A|}{|n|^d}.$$

We now apply Roth’s Theorem with $\alpha = \alpha_j$ and $\epsilon = \frac{1}{2}$, so that

$$\left|\alpha_j - \frac{m}{n}\right| \ge \frac{c_{\alpha_j,1/2}}{|n|^{5/2}}.$$

Substituting this into the previous equation, then squaring both sides and multiplying
through by denominators, we obtain either $|n| \le 1$ or

$$|n|/2 \le (|n|/2)^{2d-5} \le B$$

where $B = 8 \max_j \big(A/(c_{\alpha_j,1/2}\,|f'(\alpha_j)|)\big)^2$. Either way there are only finitely many
possibilities for integer n, and for each such n there are at most d integers m which
can be roots of the polynomial

$$a_d x^d + a_{d-1} n\, x^{d-1} + \cdots + a_1 n^{d-1} x + (a_0 n^d - A) = 0.$$

This proves the claimed result.


11.5. The abc-conjecture 


In chapter 6 we discussed various Diophantine equations with three monomials like
$x^2 + y^2 = z^2$, even $x^n + y^n = z^n$ for any integer n > 2, and there are others of
interest like $x^p - y^q = 1$. So how do we determine which of these have infinitely
many solutions in integers? This is not an easy question, and indeed the focus of
a lot of research. One modern approach (motivated by deep considerations) is to
study the prime powers dividing each term.


We begin by proving the following consequence of Roth’s Theorem: 


Corollary 11.5.1. Let $F(x,y) \in \mathbb{Z}[x,y]$ be a homogenous polynomial of degree d,
with no repeated linear factors. For each ε > 0 there exists a constant $\kappa_{F,\epsilon} > 0$ such
that for any coprime positive integers m,n:

Either F(m,n) = 0 or $|F(m,n)| \ge \kappa_{F,\epsilon}\,|n|^{d-2-\epsilon}$.


In other words, either F(m,n) = 0 or |F(m,n)| is large.


Proof. A homogenous polynomial in two variables takes the form

$$F(x,y) = \sum_{j=0}^{d} a_j x^j y^{d-j}.$$

As there are no repeated factors, F(x,y) can be divisible by y but not $y^2$. Then
f(t) = F(t,1) is a polynomial of degree d − 1 or d (depending on whether F(x,y)
is divisible by y or not) and has no repeated roots (as F has no repeated linear
factors).


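Now if m and n are coprime integers, then either F(m,n) = 0 or, from the
inequality in the proof of the Thue-Siegel Theorem,

$$\frac{|F(m,n)|}{|n|^d} = \left| f\!\left(\frac{m}{n}\right) \right| \ge \min_j \frac{|f'(\alpha_j)|}{2^{d-1}} \left| \alpha_j - \frac{m}{n} \right| \ge \frac{\kappa_{F,\epsilon}}{|n|^{2+\epsilon}}$$

with $\kappa_{F,\epsilon} := \min_j c_{\alpha_j,\epsilon}\,|f'(\alpha_j)|/2^{d-1}$, where the last inequality follows from Roth’s
Theorem.⁸ The result follows by multiplying each side through by $|n|^d$.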


Exercise 11.5.1. Let α be an algebraic number which is a root of $f(t) \in \mathbb{Z}[t]$, a polynomial of
degree d. Let $F(x,y) = y^d f(x/y)$, and suppose that there exists a constant κ > 0 such that
$|F(m,n)| \ge \kappa\,|n|^{d-2-\epsilon}$ for all integers m,n. Deduce that there exists a constant c > 0 such that
$|\alpha - m/n| \ge c/n^{2+\epsilon}$ for all integers m, n ≠ 0. (Thus Corollary 11.5.1 is “equivalent” to Roth’s
Theorem.)


We are going to move to what seems to be a rather different question but will
eventually tie in closely with Corollary 11.5.1. We study pairwise coprime, positive
integer solutions to the equation

$$a + b = c,$$

bounding the size of a, b, and c in terms of the product of the distinct primes that
divide a, b, and c:


Conjecture 11.1 (The abc-conjecture). Fix ε > 0. There exists a constant $\kappa_\epsilon > 0$
such that if a and b are coprime positive integers with c = a + b, then

$$\prod_{\substack{p \text{ prime} \\ p \text{ divides } abc}} p \ \ge\ \kappa_\epsilon\, c^{1-\epsilon}.$$

This is the abc-conjecture, one of the great open questions of modern mathematics.
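To experiment with the conjecture one needs the product of the distinct primes dividing abc. The following is a minimal sketch using trial division, which is only suitable for small inputs; the names `radical` and `abc_ratio` are ours.

```python
from math import gcd


def radical(n):
    """Product of the distinct primes dividing n (trial division)."""
    n = abs(n)
    rad, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            rad *= p
            while n % p == 0:
                n //= p
        p += 1
    if n > 1:          # whatever remains is a single prime factor
        rad *= n
    return rad


def abc_ratio(a, b):
    """Return (product of primes dividing abc, c) for coprime a, b > 0 and c = a + b."""
    if gcd(a, b) != 1:
        raise ValueError("a and b must be coprime")
    c = a + b
    return radical(a * b * c), c


if __name__ == "__main__":
    for a, b in [(1, 8), (3, 125), (2, 3 ** 10 * 109)]:
        print(a, b, abc_ratio(a, b))
```

In each of these three examples the product of primes is smaller than c, which is why the conjecture is stated with the exponent 1 − ε and an inexplicit constant rather than as a plain inequality.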

For example, if we have a putative solution to Fermat’s Last Theorem, like
$x^n + y^n = z^n$ with x,y,z > 0, then we take $a = x^n$, $b = y^n$, and $c = z^n$. Now
the product of the primes dividing $abc = (xyz)^n$ is the same as the product of the
primes dividing xyz. Therefore the abc-conjecture with ε = 1/5 implies for $n \ge 5$
that

$$\kappa\,(z^n)^{4/5} \le \prod_{\substack{p \text{ prime} \\ p \text{ divides } x^n y^n z^n}} p = \prod_{\substack{p \text{ prime} \\ p \text{ divides } xyz}} p \le xyz < z^3 \le (z^n)^{3/5}$$

where $\kappa = \kappa_{1/5}$, from which we deduce $z^n \le 1/\kappa^5$. Since $x^n, y^n < z^n$ we deduce,
from the abc-conjecture, that in every solution to $x^n + y^n = z^n$ with $n \ge 5$, the
numbers $x^n$, $y^n$, and $z^n$ are all bounded by some absolute constant, and therefore
there are only finitely many solutions. Therefore we have proved that the abc-conjecture
implies that there are only finitely many solutions to $x^n + y^n = z^n$ with
(x,y) = 1 and n > 4.

⁸ Yet again this seems like a lot of notation for a constant, especially an inexplicit constant, but the
notation reflects what the constant depends on, and given the complicated derivation of this constant,
it is certainly simpler not to try to be explicit about it.


One can compare the abc-conjecture with the abc-theorem for polynomials (as 
in section 6.7 of appendix 6A). The size of the integers replaces the degrees of the
polynomials; the prime divisors replace the irreducible polynomial factors. One 
cannot prove the abc-conjecture in the same way, since we relied heavily in our 
proof of the abc-theorem for polynomials on calculus, for which there is no analogy 
for numbers. 


We now state a conjecture which implies both the abc-conjecture and Corollary
11.5.1 of Roth’s Theorem:


Conjecture 11.2 (The abc-Roth conjecture). Let $F(x,y) \in \mathbb{Z}[x,y]$ be a homogenous
polynomial of degree d, with no repeated linear factors. For each ε > 0 there
exists a constant $\kappa_{F,\epsilon} > 0$ such that for any coprime positive integers m,n, either
F(m,n) = 0 or

$$\prod_{\substack{p \text{ prime} \\ p \text{ divides } F(m,n)}} p \ \ge\ \kappa_{F,\epsilon} \max\{|m|, |n|\}^{d-2-\epsilon}.$$


The abc-Roth conjecture implies both Corollary 11.5.1, since the product of
the primes dividing non-zero F(m,n) is $\le |F(m,n)|$, and the abc-conjecture, taking
F(x,y) = xy(x+y) (since then F(a,b) = abc when a + b = c). Quite remarkably
Conjecture 11.2 follows from the abc-conjecture using some clever algebraic geometry.
(See [2].)


Further reading for this chapter 
[1] Edward B. Burger, Diophantine olympics and world champions: Polynomials and primes down
under, Amer. Math. Monthly 107 (2000), 822–829.


[2] Andrew Granville and Thomas J. Tucker, It’s as easy as abc, Notices Amer. Math. Soc. 49 (2002),
1224–1231.


[3] Serge Lang, Old and new conjectured Diophantine inequalities, Bull. Amer. Math. Soc. 23 (1990),
37–75.


[4] H. W. Lenstra, Jr., Solving the Pell equation, Notices Amer. Math. Soc. 49 (2002), 182–192.


[5] Barry Mazur, Questions about powers of numbers, Notices Amer. Math. Soc. 47 (2000), no. 2,
195–202.


Additional exercises 


Exercise 11.6.1. Suppose (p,q) = 1 and $q \ge 1$. Determine all rationals m/n for which $\left|\frac{p}{q} - \frac{m}{n}\right| = \frac{1}{qn}$.


Exercise 11.6.2. Reprove exercise 7.10.21(a) using (11.0.1).


Exercise 11.6.3.† Prove that there are infinitely many solutions to the Pell equation $u^2 - dv^2 = 1$
with $u \equiv 1 \pmod d$.


Exercise 11.6.4. Prove that if α is transcendental, then so is $\alpha^k$ for every non-zero integer k.

Exercise 11.6.5 (The “three gaps” theorem). Given $\alpha \in \mathbb{R} \setminus \mathbb{Q}$, we put the fractional parts
$\{\alpha\}, \{2\alpha\}, \ldots, \{N\alpha\} \in [0,1)$ in ascending order as $0 < \{a_1\alpha\} < \{a_2\alpha\} < \cdots < \{a_N\alpha\} < 1$
(so that $\{a_1, \ldots, a_N\}$ is a reordering of $\{1, \ldots, N\}$). We will prove that there are at most three
distinct values in the set of consecutive differences, $D(A) := \{\{a_{j+1}\alpha\} - \{a_j\alpha\} : j = 1, \ldots, N-1\}$.
(a) Show that if $\{(a_{j+1} - 1)\alpha\} - \{(a_j - 1)\alpha\} \notin D(A)$, then either $a_j = 1$ or $a_{j+1} = 1$, or there
exists k such that $\{(a_j - 1)\alpha\} < \{a_k\alpha\} < \{(a_{j+1} - 1)\alpha\}$.
(b) Show that if $\{(a_j - 1)\alpha\} < \{a_k\alpha\} < \{(a_{j+1} - 1)\alpha\}$, then $a_k = N$.
(c) Deduce from (a) and (b) that every element of D(A) equals one of $\{a_1\alpha\}$, $1 - \{a_N\alpha\}$, or
$\{a_1\alpha\} + 1 - \{a_N\alpha\}$.
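Before attempting the proof it can be reassuring to see the statement numerically. The sketch below works in floating point and treats two gaps as equal when they differ by at most a small tolerance `tol` (an assumption that is safe for the modest values of N used here); the function name `distinct_gaps` is ours.

```python
from math import sqrt


def distinct_gaps(alpha, N, tol=1e-9):
    """Return the distinct values (up to `tol`) among the consecutive
    differences of the sorted fractional parts {alpha}, {2*alpha}, ..., {N*alpha}."""
    points = sorted((n * alpha) % 1.0 for n in range(1, N + 1))
    gaps = sorted(b - a for a, b in zip(points, points[1:]))
    distinct = []
    for gap in gaps:
        if not distinct or gap - distinct[-1] > tol:
            distinct.append(gap)
    return distinct


if __name__ == "__main__":
    for N in (5, 10, 50, 200):
        print(N, distinct_gaps(sqrt(2), N))
```

When three values occur, the largest printed gap should be the sum of the other two, which matches the three candidates listed in part (c).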


Exercise 11.6.6. Suppose that a and b are given integers, with $3 \nmid a$.
(a) Show that we can select a congruence class r (mod 3) such that if integer $m \equiv r \pmod 3$
and $x + y\sqrt{3} = (2 + \sqrt{3})^m (a + b\sqrt{3})$, then 3 divides y.
(b) Deduce that if integer N can be written in the form $a^2 - 3b^2$ where $3 \nmid N$, then there are
infinitely many pairs of powerful numbers that differ by exactly N.


Exercise 11.6.7. Find an explicit value that can be used for $c_\alpha$ in Liouville’s Theorem when
$\alpha = \sqrt{D}$ where D > 1 is a squarefree positive integer.


Exercise 11.6.8. Fix ε > 0, and integers $a_0, \ldots, a_d$. Deduce from Roth’s Theorem that there
are only finitely many pairs of coprime integers m,n for which $|a_0 n^d + a_1 n^{d-1} m + \cdots + a_d m^d| \le
\max\{|m|, |n|\}^{d-2-\epsilon}$.

Exercise 11.6.9. Assume the abc-conjecture to show that there are only finitely many sets of
integers x,y > 0 and p,q > 1 for which $x^p - y^q = 1$.


Exercise 11.6.10. Suppose that $x^p + y^q = z^r$ with x,y,z pairwise coprime and $\frac{1}{p} + \frac{1}{q} + \frac{1}{r} < 1$.
(a) Prove that $\frac{1}{p} + \frac{1}{q} + \frac{1}{r} \le \frac{41}{42}$.
(b) Assume the abc-conjecture. Prove that there exists a constant B for which $|x^p|, |y^q|, |z^r| \le
B$.
Exercise 11.6.11. The abc-conjecture is “best possible” in that one cannot take ε = 0. To
establish this, we need to find examples of solutions to a + b = c in which $(1/c)\prod_{p \mid abc} p$ gets
arbitrarily small.

(a) Prove that if $m^2 \mid b$, then $\prod_{p \mid b} p \le b/m$.

(b) Prove that for any odd integer m there exists an integer n for which $2^n \equiv 1 \pmod{m^2}$.
(c)† Combine these two observations to show that for any ε > 0 there exist coprime integers
a + b = c for which $\prod_{p \mid abc} p < \epsilon c$.
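One way to see parts (a) to (c) in action, for very small m only, is to take a = 1, $b = 2^n - 1$ and $c = 2^n$ with n the order of 2 modulo $m^2$. The sketch below is an illustration of that particular choice (ours, and only one of many possible), with trial-division factoring that is far too slow for large m.

```python
def radical(n):
    """Product of the distinct primes dividing n (trial division)."""
    rad, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            rad *= p
            while n % p == 0:
                n //= p
        p += 1
    if n > 1:
        rad *= n
    return rad


def order_of_two(modulus):
    """Smallest n >= 1 with 2^n congruent to 1 modulo an odd modulus > 1."""
    n, power = 1, 2 % modulus
    while power != 1:
        power = (power * 2) % modulus
        n += 1
    return n


def abc_example(m):
    """For odd m > 1 return (n, product of primes dividing abc, c)
    for the triple a = 1, b = 2^n - 1, c = 2^n with m^2 dividing b."""
    n = order_of_two(m * m)
    b, c = 2 ** n - 1, 2 ** n
    return n, radical(b * c), c


if __name__ == "__main__":
    for m in (3, 5, 7, 9):
        n, rad, c = abc_example(m)
        print(m, n, rad / c)
```

In each case the printed ratio is below 2/m, so letting m grow forces the ratio towards 0.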


Appendix 11A. Uniform 
distribution 


11.7. nα mod 1


Dirichlet’s Theorem, in section implies that na mod 1 gets arbitrarily close 
to 0 as n runs through a sequence of integers n. One might also ask whether na 
mod 1 gets arbitrarily close to any given θ ∈ (0,1).


Theorem 11.5 (Kronecker’s Theorem). If α is a real irrational number, then the
numbers {nα} are dense on [0,1).


Proof. Fix ε > 0. By Dirichlet’s Theorem there exists an integer n with $\|n\alpha\| < \epsilon$,
where $\|t\|$ is the distance from t to the nearest integer. As α is irrational we also
have that $\|n\alpha\| \neq 0$, and so {nα} ∈ (0,ε) or {nα} ∈ (1 − ε, 1). We will assume that
{nα} ∈ (0,ε) (the case with {nα} ∈ (1 − ε, 1) being proved analogously).

Let δ = {nα} ∈ (0,ε). Select D to be the largest integer < 1/δ and so

$$\{n\alpha\}, \{2n\alpha\}, \ldots, \{Dn\alpha\} = \delta, 2\delta, \ldots, D\delta$$

is a set of points in [0,1), consecutive points being spaced δ < ε apart. Therefore
if θ ∈ [0,1), then we let k = [θ/δ] and so θ − kδ ∈ [0,δ), which implies that

$$\theta - \{kn\alpha\} = \theta - k\{n\alpha\} = \theta - k\delta \in [0,\delta) \subset [0,\epsilon).$$

That is, there are integer multiples of α in R/Z that are arbitrarily close to θ.
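The proof is constructive once a suitable n is in hand. The sketch below finds n by brute force instead of through Dirichlet's Theorem (our simplification), insists on the case {nα} ∈ (0,ε), and then returns the multiple kn from the proof. It works in floating point, so it is only meant for moderate ε; the names are ours.

```python
from math import floor, sqrt


def frac(t):
    """Fractional part {t} of a real number t."""
    return t - floor(t)


def kronecker_multiple(alpha, theta, eps, max_n=10 ** 6):
    """Return an integer N with 0 <= theta - {N*alpha} < eps, following the proof:
    find n with delta = {n*alpha} in (0, eps), then take N = k*n with k = [theta/delta]."""
    for n in range(1, max_n + 1):
        delta = frac(n * alpha)
        if 0 < delta < eps:
            k = floor(theta / delta)
            return k * n
    raise ValueError("no suitable n found below max_n")


if __name__ == "__main__":
    alpha, theta, eps = sqrt(2), 0.3, 0.01
    N = kronecker_multiple(alpha, theta, eps)
    print(N, frac(N * alpha), theta - frac(N * alpha))
```

If θ is smaller than δ the function returns 0, which is consistent with the proof since then θ itself lies in [0,δ).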


Exercise 11.7.1. Show that the conclusion of the theorem is not true if α is rational.


Exercise 11.7.2. Prove Kronecker’s Theorem when nα (mod 1) ∈ (1 − ε, 1).


Now that we know that if α is irrational, then nα mod 1 gets arbitrarily close to any
given θ ∈ [0,1), we might ask how often nα mod 1 gets close to each θ ∈ [0,1). Are
the values of nα mod 1 roughly equidistributed? To answer this question we must
determine how often {nα} ∈ [θ − ε, θ + ε] for θ ∈ (0,1) and sufficiently small ε > 0.
If the numbers {nα} are equidistributed, then we might expect the frequency to
be roughly proportional to the length of the interval. The analogous question can
be asked for any sequence of numbers $x_1, x_2, \ldots \in [0,1)$. We say that $\{x_n\}_{n \ge 1}$ is
uniformly distributed mod 1 (or equidistributed mod 1) if for any a < b ∈ [0,1),

$$\lim_{N \to \infty} \frac{\#\{n \le N : a < x_n < b\}}{N} \quad\text{exists and equals } b - a.$$

The values of x (mod 1) are in 1-to-1 correspondence with the values of e(x)
(where $e(t) := e^{2i\pi t}$) as its value depends on x (mod 1) and not on x. Moreover
the values e(kx) for any given integer k ≠ 0 remain consistent for x with any given
value mod 1. That is, if x = m + δ with 0 ≤ δ < 1, then kx = km + kδ so that
{kx} = {kδ}. This suggests that to study a sequence of values $x_n$ mod 1, we might
use Fourier analysis. This thinking leads to the famous theorem of Hermann Weyl 
(for more on this, including the proof, see ): 


Theorem 11.6 (Weyl’s uniform distribution theorem). The sequence $\{x_n\}_{n \ge 1}$ is
uniformly distributed mod 1 if and only if for all non-zero integers k we have

$$\lim_{N \to \infty} \frac{1}{N} \sum_{n=1}^{N} e(k x_n) \quad\text{exists and equals } 0.$$
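Weyl's criterion is easy to probe numerically. The following sketch computes the averages $\frac{1}{N}\sum_{n \le N} e(k x_n)$ for $x_n = \{n\alpha\}$ in floating point, once for an irrational α and once for a rational one; the helper names are ours.

```python
import cmath
from math import pi, sqrt


def weyl_average(xs, k):
    """Average of e(k*x) = exp(2*pi*i*k*x) over the finite sequence xs."""
    return sum(cmath.exp(2j * pi * k * x) for x in xs) / len(xs)


def multiples(alpha, N):
    """The fractional parts {alpha}, {2*alpha}, ..., {N*alpha}."""
    return [(n * alpha) % 1.0 for n in range(1, N + 1)]


if __name__ == "__main__":
    N = 5000
    for k in (1, 2, 3):
        print("sqrt(2), k =", k, abs(weyl_average(multiples(sqrt(2), N), k)))
    # For alpha = 1/2 and k = 2 every term equals 1, so the average does not tend to 0.
    print("1/2, k = 2", abs(weyl_average(multiples(0.5, N), 2)))
```

A finite computation of course proves nothing about the limit, but the contrast between the two cases is what the theorem leads one to expect.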


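Exercise 11.7.3. (a) Show that $\sum_{n=1}^{N} e(\{n\alpha\}) = \frac{e(N\alpha) - 1}{1 - e(-\alpha)}$ if $\alpha \notin \mathbb{Z}$, and then deduce that

$$\left| \sum_{n=1}^{N} e(\{n\alpha\}) \right| \le \frac{1}{|\sin(\pi\alpha)|}.$$

(b) Use Weyl’s uniform distribution theorem to deduce that if α is a real, irrational number,
then $\{n\alpha\}_{n \ge 1}$ is uniformly distributed mod 1.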


One can prove that {na} is uniformly distributed mod 1 using fairly elementary 
ideas though it is not easy: 


Exercise 11.7.4. Let $x_1, x_2, \ldots \in [0,1)$ be a sequence of numbers. Suppose that there are
arbitrarily large integers M for which

$$\lim_{N \to \infty} \frac{1}{N} \#\left\{ n \le N : \frac{m}{M} < x_n \le \frac{m+1}{M} \right\} \quad\text{exists and equals } \frac{1}{M},$$

for 0 ≤ m ≤ M − 1. Deduce that $\{x_n\}_{n \ge 1}$ is uniformly distributed mod 1.


Exercise 11.7.5.† Let α be a real, irrational number. In this exercise we sketch a proof that
$\{n\alpha\}_{n \ge 1}$ is uniformly distributed mod 1. Fix ε > 0 arbitrarily small.
(a) Use Kronecker’s Theorem to show that there exists an integer N ≥ 1 such that {Nα} = δ ∈
(0,ε).
(b) Prove that if {nα} < 1 − δ, then {(n + N)α} = {nα} + δ. What if {nα} ≥ 1 − δ?
(c) Suppose that 0 ≤ t < 1 − 2δ. Show that {nα} ∈ [t, t + δ] if and only if {(n + N)α} ∈
[t + δ, t + 2δ], and so deduce that

$$\big| \#\{1 \le n \le x : t < \{n\alpha\} \le t + \delta\} - \#\{1 \le n \le x : t + \delta < \{n\alpha\} \le t + 2\delta\} \big| \le N.$$

Now let δ = 1/M for some large integer M.
(d)† Use (c) to show that if 0 ≤ m ≤ M − 1, then

$$\left| \#\left\{ 1 \le n \le x : \frac{m}{M} < \{n\alpha\} \le \frac{m+1}{M} \right\} - \frac{x}{M} \right| \le MN.$$

(e) Deduce that $\{n\alpha\}_{n \ge 1}$ is uniformly distributed mod 1 using exercise 11.7.4.


Kronecker’s Theorem in n dimensions. In exercise 11.1.2 we saw that Dirichlet’s
Theorem may be generalized to k dimensions; that is, given $\alpha_1, \ldots, \alpha_k \in \mathbb{R}$,
for any ε > 0 there exist infinitely many integers n such that each $\|n\alpha_j\| < \epsilon$. To
generalize Kronecker’s Theorem we would like that for $\theta_1, \ldots, \theta_k \in \mathbb{R}$ there are
infinitely many n for which each $\|n\alpha_j - \theta_j\| < \epsilon$. However this is not true in all
cases, even when k = 1: In the hypothesis of Theorem 11.5 we needed that α is
irrational, and we showed that this is necessary in exercise 11.7.1. Another way to
state that α is irrational is to insist that 1 and α are linearly independent over Z.


In two dimensions we find another obstruction: Suppose that $\alpha_1 = \alpha$ and
$\alpha_2 = 1 - \alpha$. If $\|n\alpha_j - \theta_j\| < \epsilon$ for each j, then

$$\|\theta_1 + \theta_2\| = \|n - \theta_1 - \theta_2\| \le \|n\alpha_1 - \theta_1\| + \|n\alpha_2 - \theta_2\| < 2\epsilon.$$

But this should hold for any ε > 0 which implies that $\theta_1 + \theta_2$ is an integer. Notice
that in this example $1, \alpha_1, \alpha_2$ are not linearly independent over Z.


Exercise 11.7.6. Let $\alpha_1, \ldots, \alpha_k, \theta_1, \ldots, \theta_k \in \mathbb{R}$ be given, and assume that there are integers
$c_0, \ldots, c_k$ for which $c_0 + c_1\alpha_1 + \cdots + c_k\alpha_k = 0$. Suppose that for all ε > 0 there are infinitely
many n for which $\|n\alpha_j - \theta_j\| < \epsilon$ for j = 1,2,...,k. Prove that $c_1\theta_1 + \cdots + c_k\theta_k \in \mathbb{Z}$.


These are the only obstructions to the generalization: 


Theorem 11.7 (Kronecker’s Theorem in n dimensions). Assume that the real
numbers $1, \alpha_1, \ldots, \alpha_k$ are linearly independent over Z. Then the points

$$(n\alpha_1, \ldots, n\alpha_k)_{n \ge 1} \text{ are dense in } (\mathbb{R}/\mathbb{Z})^k.$$

In other words, for any given $\theta_1, \ldots, \theta_k \in \mathbb{R}$ and any ε > 0 there are infinitely many
integers n for which $\|n\alpha_j - \theta_j\| < \epsilon$ for all j = 1,...,k.


This can be proved in several different ways that are accessible though tough.
We refer the reader to sections 23.5–23.8 of [HW08].


11.8. Bouncing billiard balls 


Billiards, snooker, and pool are all played on a rectangular table, hitting the ball 
along the surface. The sides of the table are cushioned so that the ball bounces off 
the side at the opposite angle to which it hits. That is, if it hits at angle $\alpha^\circ$, then
it bounces off at angle $(180 - \alpha)^\circ$. Sometimes one miscues and the ball carries on
around the table, coming to a stop without hitting another ball. Have you ever
wondered what would happen if there were no friction, so that the ball never stops?
Would your ball eventually hit the ball it is supposed to hit, no matter where that
other ball is placed? Or could it go on bouncing forever without ever getting to
the other ball? We could rephrase this question more mathematically by supposing
that we play on a table in the complex plane, with two sides along the x- and
y-axes. Say the table length is $\ell$ and width is $w$ so that it is the rectangle with
corners at $(0,0)$, $(0,\ell)$, $(w,0)$, $(w,\ell)$. Let us suppose that the ball is hit from the
point $(u,v)$ along a line with slope $\alpha$ (that is, at an angle $\alpha$ from the horizontal).
As the line continues on indefinitely inside the box, does it get arbitrarily close to
every point inside the box?


Exercise 11.8.1. Show that by rescaling with the map $x \to x/w$, $y \to y/\ell$ we can assume,
without any loss of generality, that the billiards table is the unit square.


As a consequence of exercise 11.8.1 we may henceforth assume that $w = \ell = 1$.

The ball would run along the line $L := \{(u + t, v + \alpha t) : t \ge 0\}$ if it did not
hit the sides of the table. Notice though that if after each time it hit a side, we
reflected the true trajectory through the line that represents that side, then indeed
the ball’s trajectory would be $L$.


Figure 11.1. Billiards on the complex plane and on the unit square. Follow- 
ing a path inside the fundamental domain of a lattice: The path segment ¢; 
gets mapped to 2; for j = 2,...,6. 




Develop this to prove: 


Exercise 11.8.2. Show that the billiard ball is at $(x,y)$ after time $t$, where $x$ and $y$ are given as
follows:

Let $m = [u + t]$. If $m$ is even, let $x = \{u + t\}$; if $m$ is odd, let $x = 1 - \{u + t\}$.

Let $n = [v + \alpha t]$. If $n$ is even, let $y = \{v + \alpha t\}$; if $n$ is odd, let $y = 1 - \{v + \alpha t\}$.
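

The formula in exercise 11.8.2 translates directly into code. The sketch below is only an illustration of the stated formula on the unit square (it does not prove it); the names `fold` and `billiard_position` are hypothetical, and $[z]$, $\{z\}$ are taken to be the integer and fractional parts of $z$.

```python
import math

def fold(z):
    """Map z to [0, 1]: keep {z} when [z] is even, use 1 - {z} when [z] is odd."""
    m = math.floor(z)          # the integer part [z]
    frac = z - m               # the fractional part {z}
    return frac if m % 2 == 0 else 1 - frac

def billiard_position(u, v, alpha, t):
    """Position at time t of a ball hit from (u, v) with slope alpha,
    on the unit-square table, following exercise 11.8.2."""
    return fold(u + t), fold(v + alpha * t)

print(billiard_position(0.25, 0.5, 0.5, 1.0))
```

Folding each coordinate separately corresponds to reflecting the straight line $L$ back into the table one side at a time.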

Exercise 11.8.3. Show that if $\alpha$ is rational, then the ball eventually ends up exactly where it
started from, and so it does not get arbitrarily close to every point on the table.


So how close does the trajectory get to the point $(r,s)$, where $r, s \in [0,1)$? Let
us consider all of those values of $t$ for which $x = r$, with $m$ and $n$ even to simplify
matters (with $m$ and $n$ as in exercise 11.8.2), and see if we can determine whether
$y$ is ever close to $s$.


Exercise 11.8.4. Show that $[z]$ is even if and only if $\{z/2\} \in [0, 1/2)$. Deduce that $[z]$ is even
and $\{z\} = r$ if and only if $\{z/2\} = r/2$.


Hence we want $(u+t)/2 = k + r/2$ for some integer $k$; that is, $t = 2k + (r - u)$, $k \in \mathbb{Z}$.
In that case $v + \alpha t = 2\alpha k + \alpha(r - u) + v$ so we want $\{\alpha k + (\alpha(r - u) + v)/2\}$
close to $s/2$. That is, $k\alpha$ mod 1 should be close to $\theta := \{(s - \alpha(r - u) - v)/2\}$. Now,
in Kronecker’s Theorem (Theorem 11.5) we showed that the values $k\alpha$ mod 1 are
dense in $[0,1)$ when $\alpha$ is irrational, and so in particular there are values of $k$ that
allow $k\alpha$ mod 1 to be arbitrarily close to $\theta$. Hence we have proved the difficult part
of the following corollary:


Corollary 11.8.1. If $\alpha$ is a real irrational number, then any ball moving at angle
$\alpha$ (to the coordinate axes) will eventually get arbitrarily close to any point on a
1-by-1 billiards table.
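

The argument leading to the corollary can be tried out numerically: restrict to the times $t = 2k + (r - u)$, at which $x = r$, and search for a $k$ that brings $y$ close to $s$. This is a hedged sketch using floating-point arithmetic; the function names, the bound `k_max`, and the sample starting point and target are assumptions made for illustration.

```python
import math

def fold(z):
    """Keep {z} when [z] is even, use 1 - {z} when [z] is odd."""
    m = math.floor(z)
    frac = z - m
    return frac if m % 2 == 0 else 1 - frac

def first_close_visit(u, v, alpha, r, s, eps, k_max=10**5):
    """Search the times t = 2k + (r - u), k = 1, 2, ..., for a moment when the
    ball is within eps of (r, s) in both coordinates. Returns (t, (x, y)),
    or None if nothing is found up to the hypothetical bound k_max."""
    for k in range(1, k_max + 1):
        t = 2 * k + (r - u)
        x, y = fold(u + t), fold(v + alpha * t)
        if abs(x - r) < eps and abs(y - s) < eps:
            return t, (x, y)
    return None

print(first_close_visit(0.2, 0.1, math.sqrt(2), 0.5, 0.8, 0.01))
```

For rational $\alpha$ the same search can fail for every $k$, in line with exercise 11.8.3.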


We finish with a challenge question to develop a similar theory of billiards 
played on a circular table! 


Exercise 11.8.5. Imagine a trajectory inside the unit circle. A ball is hit and continues indefinitely.
When it hits a side at angle $\theta$ (compared to the normal line at that point), it bounces off
at angle $-\theta$.

(a) Suppose that the first two points at which the ball hits the edge are at $e(\beta)$ and then at
$e(\beta + \alpha)$. Show that the ball hits the edge at $e(\beta + n\alpha)$ for $n = 0, 1, 2, \ldots$.

(b) Prove that the ball falls into a repeated trajectory if and only if $\alpha$ is rational.

(c) Show that if $\alpha$ is irrational, then the points at which the ball hits the circle edge are dense
(i.e., eventually the ball comes arbitrarily close to any point on the edge) but that it never
hits the same edge point twice.

(d) Prove that the ball’s trajectory never comes inside the circle of radius $|\cos(\alpha/2)|$. Deduce
that the trajectory of the ball is never dense inside the unit circle.

(e) Prove that if $\alpha$ is irrational, then the trajectory of the ball is dense inside the ring between
the circle of radius $|\cos(\alpha/2)|$ and the circle of radius 1. (The technical word for a ring is
an annulus.)


Appendices. The extended version of chapter 11 has the following additional 
appendices: 


Appendix 11B. Continued fractions introduces and analyzes continued fractions 
for all real numbers, focusing on continued fractions for quadratic irrationals. We 
find and justify a particularly efficient algorithm for finding all the solutions to 
Pell’s equation using continued fractions. 




Appendix 11C. Two-variable quadratic equations establishes that, other than 
in certain special cases, if there is one solution to a given two-variable quadratic 
equation, then there are infinitely many. 

Appendix 11D. Transcendental numbers discusses how many transcendental
numbers there are, via Cantor’s diagonalization argument. We show that $e$ and $\pi$
are irrational and then discuss “normal numbers”.


Chapter 12 


Binary quadratic forms 


Let $a$, $b$, and $c$ be given integers. We saw in Corollary [3.1] that the integers that
can be represented by the binary linear form $ax + by$ are those integers divisible by
$\gcd(a, b)$. We are now interested in what integers can be represented by the binary
quadratic form¹

$f(x, y) = ax^2 + bxy + cy^2.$

As in the linear case, we can immediately reduce our considerations to the case
that $\gcd(a, b, c) = 1$.

The first important result of this type was given by Fermat for the particular
example $f(x,y) = x^2 + y^2$, as discussed in section 9.1. The two main results were
that an odd prime $p$ can be represented by $f(x,y)$ if and only if $p \equiv 1 \pmod 4$,
and that the product of two integers that can be written as the sum of two squares
can also be written as the sum of two squares, a consequence of the identity (9.1.1).
One can combine these two facts to classify exactly which integers are represented
by the binary quadratic form $x^2 + y^2$.


At first sight it looks like it might be difficult to work with the example $f(x,y) =
x^2 + 20xy + 101y^2$. However, this can be rewritten as $(x + 10y)^2 + y^2$ and so represents
exactly the same integers as $g(x,y) = x^2 + y^2$. In other words


$n = f(u,v)$ if and only if $n = g(r,s)$, where $\begin{pmatrix} r \\ s \end{pmatrix} = \begin{pmatrix} 1 & 10 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} u \\ v \end{pmatrix}$.


This 2-by-2 matrix is invertible over the integers, so we can express $u$ and $v$
as integer linear combinations of $r$ and $s$. Thus every representation of $n$ by $f$
corresponds to one by $g$, and vice versa, a 1-to-1 correspondence, obtained using
the invertible linear transformation $u, v \to u + 10v, v$. Such a pair of quadratic
forms, $f$ and $g$, are said to be equivalent; and we have just seen how equivalent
binary quadratic forms represent exactly the same integers. The discriminant of
$ax^2 + bxy + cy^2$ is $b^2 - 4ac$. We will show that equivalent binary quadratic forms have
the same discriminant, so that it is an invariant of the equivalence class of binary
quadratic forms. All of this will be discussed in this chapter and, in appendix 12A,
we will study generalizations of the identity (9.1.1).


¹ “Binary” as in the two variables $x$ and $y$, and “quadratic” as in degree two. The monomials
$ax^2$, $bxy$, $cy^2$ each have degree two, since the degree of a term is given by the degree in $x$ plus the degree
in $y$.


12.1. Representation of integers by binary quadratic forms 


An integer $N$ is represented by $f$ if there exist integers $m, n$ for which $N = f(m,n)$,
and $N$ is properly represented if $(m,n) = 1$ (see exercise 3.9.1(a) for the same question
for linear forms).


Exercise 12.1.1. Prove that if N is squarefree, then all representations of N are proper. 


What integers can be properly represented by $ax^2 + bxy + cy^2$? That is, for
what integers $N$ do there exist coprime integers $m, n$ such that


(12.1.1) $N = am^2 + bmn + cn^2$?


We may reduce to the case that $\gcd(a, b, c) = 1$ by dividing through by $\gcd(a, b, c)$.
(If $\gcd(a, b, c) = 1$, then $f$ is a primitive binary quadratic form.) One idea is to
complete the square to obtain


(12.1.2) $4aN = (2am + bn)^2 - dn^2$


where the discriminant $d := b^2 - 4ac$. This implies that the discriminant always
satisfies
$d \equiv 0$ or $1 \pmod 4$.


There is always at least one binary quadratic form of discriminant $d$, for such
$d$, which we call the principal form:


$x^2 - (d/4)y^2$ when $d \equiv 0 \pmod 4$,
$x^2 + xy + \frac{1-d}{4}y^2$ when $d \equiv 1 \pmod 4$.

We call $d$ a fundamental discriminant if $d = D \equiv 1 \pmod 4$, or $d = 4D$ with
$D \equiv 2$ or $3 \pmod 4$, and if $D = d/(d,4)$ is squarefree. These are precisely the
discriminants for which every binary quadratic form is primitive (see exercise 12.1.3).
We met this notion already in exercise 8.16.4 of appendix 8D, when classifying the
genuinely different Jacobi symbols.


When $d < 0$ the right side of (12.1.2) can only take positive values, which makes
our discussion easier than when $d > 0$. For this reason we will restrict ourselves to
the case $d < 0$ here and revisit the case $d > 0$ in appendix 12C. If $d < 0$ and $a < 0$,
we replace $a, b, c$ by $-a, -b, -c$, so as to ensure that $am^2 + bmn + cn^2$ is always
$\ge 0$; in this case, we call $ax^2 + bxy + cy^2$ a positive definite binary quadratic form.


At the start of this chapter we worked through one example of equivalence of
binary quadratic forms, and here is another: The binary quadratic form $x^2 + y^2$
represents the same integers as $X^2 + 2XY + 2Y^2$, for if $N = m^2 + n^2$, then $N =
(m-n)^2 + 2(m-n)n + 2n^2$, and similarly if $N = u^2 + 2uv + 2v^2$, then $N = (u+v)^2 + v^2$.
The reason is that the substitution


$\begin{pmatrix} x \\ y \end{pmatrix} = M \begin{pmatrix} X \\ Y \end{pmatrix}$ where $M = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}$

transforms $x^2 + y^2$ into $X^2 + 2XY + 2Y^2$, and the transformation is invertible, since
$\det M = 1$. We therefore say that $x^2 + y^2$ and $X^2 + 2XY + 2Y^2$ are equivalent,
which we denote by

$x^2 + y^2 \sim X^2 + 2XY + 2Y^2.$


Much more generally define


$\mathrm{SL}(2,\mathbb{Z}) = \left\{ \begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix} : \alpha, \beta, \gamma, \delta \in \mathbb{Z} \text{ and } \alpha\delta - \beta\gamma = 1 \right\}.$


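We can represent the binary quadratic form as


$ax^2 + bxy + cy^2 = \begin{pmatrix} x & y \end{pmatrix} \begin{pmatrix} a & b/2 \\ b/2 & c \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix}.$


Its discriminant is $-4$ times the determinant of $\begin{pmatrix} a & b/2 \\ b/2 & c \end{pmatrix}$. We deduce that if


$\begin{pmatrix} x \\ y \end{pmatrix} = M \begin{pmatrix} X \\ Y \end{pmatrix}$ where $M \in \mathrm{SL}(2,\mathbb{Z})$,


then


$AX^2 + BXY + CY^2 = \begin{pmatrix} X & Y \end{pmatrix} M^T \begin{pmatrix} a & b/2 \\ b/2 & c \end{pmatrix} M \begin{pmatrix} X \\ Y \end{pmatrix},$


so that


(12.1.3) $\begin{pmatrix} A & B/2 \\ B/2 & C \end{pmatrix} = M^T \begin{pmatrix} a & b/2 \\ b/2 & c \end{pmatrix} M,$


which yields the somewhat painful looking explicit formulas (writing $M = \begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix}$)


$A = f(\alpha, \gamma) = a\alpha^2 + b\alpha\gamma + c\gamma^2,$
(12.1.4) $B = 2\alpha\beta a + (\alpha\delta + \beta\gamma) b + 2\gamma\delta c,$
$C = f(\beta, \delta) = a\beta^2 + b\beta\delta + c\delta^2.$


When working with binary quadratic forms it is convenient to represent $ax^2 +
bxy + cy^2$ by the notation $[a,b,c]$. We have just proven the following.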
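

The explicit formulas (12.1.4) are easy to mechanize. The following Python sketch (the function names `transform` and `discriminant` are hypothetical) applies a substitution $(x, y)^T = M (X, Y)^T$ to a form $[a,b,c]$ and can be used to check the two examples of equivalence given so far, as well as the invariance of the discriminant.

```python
def transform(form, M):
    """Apply (x, y) = M (X, Y) to the form [a, b, c] and return [A, B, C],
    using the explicit formulas (12.1.4). M = ((alpha, beta), (gamma, delta))."""
    a, b, c = form
    (al, be), (ga, de) = M
    A = a * al * al + b * al * ga + c * ga * ga
    B = 2 * a * al * be + b * (al * de + be * ga) + 2 * c * ga * de
    C = a * be * be + b * be * de + c * de * de
    return (A, B, C)

def discriminant(form):
    """Return b^2 - 4ac for the form [a, b, c]."""
    a, b, c = form
    return b * b - 4 * a * c

# x^2 + y^2 under M = ((1, 1), (0, 1)) and under ((1, 10), (0, 1))
print(transform((1, 0, 1), ((1, 1), (0, 1))))
print(transform((1, 0, 1), ((1, 10), (0, 1))))
```

Both matrices have determinant 1, so the resulting forms are equivalent to $x^2 + y^2$ and share its discriminant $-4$.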


Proposition 12.1.1. If $f = [a,b,c] \sim F = [A,B,C]$, then there exist integers
$\alpha, \beta, \gamma, \delta$ with $\alpha\delta - \beta\gamma = 1$ for which $A = f(\alpha, \gamma)$ and $C = f(\beta, \delta)$. Moreover $f$ and
$F$ represent the same integers, and there is a 1-to-1 correspondence between their
representations and proper representations of a given integer.


Exercise 12.1.2. (a) Suppose that $d$ is a fundamental discriminant. Prove that the character
$(d/\cdot)$ has conductor dividing $d$.
(b) Prove that for any non-zero integer $d$, the character $(d/\cdot)$ has conductor that divides $4d$.


The conductor of $f(\cdot)$ is the minimum $p > 0$ such that $f(n + p) = f(n)$ for all integers $n$.


Exercise 12.1.3. Suppose that $d \equiv 0$ or $1 \pmod 4$. Show that every binary quadratic form of
discriminant $d$ is primitive if and only if $d$ is a fundamental discriminant.


Exercise 12.1.4. (a) Show that if $d < 0$, then $am^2 + bmn + cn^2$ has the same sign as $a$, no
matter what the choices of integers $m$ and $n$.
(b) Show that if $ax^2 + bxy + cy^2$ is positive definite, then $a, c > 0$.
(c) Show that if $d > 0$, then $am^2 + bmn + cn^2$ can take both positive and negative values, by
making explicit choices of integers $m, n$.


Exercise 12.1.5. Use (12.1.3) to show that two equivalent binary quadratic forms have the same
discriminant.


Exercise 12.1.6. Show that the principal form is equivalent to every binary quadratic form
$x^2 + bxy + cy^2$ with leading coefficient 1, up to equivalence.


Exercise 12.1.7. In each part, determine whether the two binary quadratic forms are equivalent. 
If so, make the equivalence explicit; if not, explain why not. 

(a) $y^2 + xy + 4x^2$ and $x^2 - 5xy + 10y^2$.

(b) a? + 3ay + 5y? and 3x? — dary 4 11y?. 


12.2. Equivalence classes of binary quadratic forms 


In this section we will develop an algorithm that will allow us to show, for example,
that $29X^2 + 82XY + 58Y^2$ is equivalent to $x^2 + y^2$. We do this as it is surely more
intuitive to work with the latter form rather than the former. Gauss observed that
every equivalence class of binary quadratic forms (with $d < 0$) contains a unique
smallest representative, called the reduced representative, which we now prove:
The quadratic form $ax^2 + bxy + cy^2$ with discriminant $d < 0$ is reduced if


$-a < b \le a \le c$, and $b \ge 0$ whenever $a = c$.


Theorem 12.1. Every positive definite binary quadratic form is equivalent to a 
reduced form. 


Proof. We will define a sequence of properly equivalent forms; the algorithm
terminates when we reach one that is reduced. Given a form $[a,b,c]$, we use one of
three transformations, described in terms of matrices from $\mathrm{SL}(2,\mathbb{Z})$:


(i) If $c < a$, the transformation

$\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} X \\ Y \end{pmatrix}$

yields the form $[c, -b, a]$ which is properly equivalent to $[a,b,c]$ (as $ax^2 +
bxy + cy^2 = a(-Y)^2 + b(-Y)(X) + c(X)^2 = cX^2 - bXY + aY^2$). Hence
$A = c < a = C$.


(ii) If $b > a$ or $b \le -a$, then select $B$ to be the absolutely least residue of $b$
(mod $2a$), so that $-a < B \le a$, say $B = b - 2ka$. The transformation matrix
will be

$\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 1 & -k \\ 0 & 1 \end{pmatrix} \begin{pmatrix} X \\ Y \end{pmatrix}.$

The resulting form $[A,B,C]$ with $A = a$ is properly equivalent to $[a,b,c]$,
where $-A < B \le A$.

(iii) If $c = a$ and $-a < b < 0$, then the transformation

$\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} X \\ Y \end{pmatrix}$

yields the form $[A, B, A]$ with $A = a$ and $B = -b$, so that $0 < B < A$.

If the resulting form is not reduced, then repeat the algorithm. If none of these
hypotheses holds, then one can easily verify that the form is reduced. To prove that
the algorithm terminates in finitely many steps we follow the leading coefficient
$a$: $a$ starts as a positive integer. Each transformation of type (i) reduces the size
of $a$. It stays the same after transformations of type (ii) or (iii), but after a type
(iii) transformation the algorithm terminates, and after a type (ii) transformation
we either have another type (i) transformation or else the algorithm stops after at
most one more transformation. Hence the algorithm finishes in no more than $2a + 1$
steps.


Examples. Applying the reduction algorithm to the form $[76, 217, 155]$ of
discriminant $-31$, one finds the sequence of forms


$[76, 65, 14]$, $[14, -65, 76]$, $[14, -9, 2]$, $[2, 9, 14]$, $[2, 1, 4]$,


the sought-after reduced form. Similarly the form $[11, 49, 55]$ of discriminant $-19$
gives the sequence of forms $[11, 5, 1]$, $[1, -5, 11]$, $[1, 1, 5]$.
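

The three transformations in the proof of Theorem 12.1 can be turned into a short program. This is a minimal Python sketch of that procedure, assuming the input is positive definite ($a > 0$ and $b^2 - 4ac < 0$); the function name `reduce_form` and the choice to return the whole list of intermediate forms are illustrative assumptions.

```python
def reduce_form(a, b, c):
    """Reduce a positive definite form [a, b, c] by the three steps in the
    proof of Theorem 12.1. Returns every form visited, the last one reduced."""
    steps = [(a, b, c)]
    while True:
        if c < a:
            # type (i): [a, b, c] -> [c, -b, a]
            a, b, c = c, -b, a
        elif b > a or b <= -a:
            # type (ii): choose k with -a < b - 2ka <= a
            k = (b + a - 1) // (2 * a)
            a, b, c = a, b - 2 * k * a, a * k * k - b * k + c
        elif c == a and b < 0:
            # type (iii): [a, b, a] -> [a, -b, a]
            a, b, c = a, -b, a
        else:
            return steps
        steps.append((a, b, c))

print(reduce_form(76, 217, 155))
print(reduce_form(11, 49, 55))
```

Running it on $[29, 82, 58]$ should end at $[1, 0, 1]$, which is the equivalence with $x^2 + y^2$ promised at the start of the section.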


This proof of Theorem [12.1] can be rephrased to prove Theorem [1.2] of section 
[[.10] (of appendix 1A), that every matrix in SL(2,Z) can be represented as the 
product of powers of the matrices S = ({ i) and T = (° 3): The matrices 
0 


used in the transformations in the proof of Theorem |[L2.1]are (; 


1 —k = —k 
Gree 


The very precise conditions in the definition of “reduced” were chosen so that
every positive definite binary quadratic form is properly equivalent to a unique
reduced form. The key to proving uniqueness is exercise 12.6.1; the (messy) details
are completed in exercise 12.6.2.


= ae 
ee and 


12.3. Congruence restrictions on the values of a binary quadratic form 


What restrictions are there on the values that can be taken by a binary quadratic
form (in analogy to Theorem 9.2)?


Proposition 12.3.1. Let $d = b^2 - 4ac$ where $(a,b,c) = 1$.


(i) If integer $N$ is properly represented by $ax^2 + bxy + cy^2$, then $d$ is a square mod
$4N$.

(ii) If $d$ is a square mod $4N$, then there exists a binary quadratic form of
discriminant $d$ that properly represents $N$.


Proof. (ii) If $d \equiv b^2 \pmod{4N}$, then $d = b^2 - 4Nc$ for some integer $c$, and so
$Nx^2 + bxy + cy^2$ is a quadratic form of discriminant $d$ which represents $N =
N \cdot 1^2 + b \cdot 1 \cdot 0 + c \cdot 0^2$.


(i) Suppose that $N = am^2 + bmn + cn^2$ with $(m,n) = 1$. Then $(2am + bn)^2 -
dn^2 = 4aN$ so that $dn^2 \equiv (2am + bn)^2 \pmod{4N}$; that is, $dn^2$ is a square mod $4N$
and, analogously, $dm^2$ is a square mod $4N$. Now if $p$ is a prime such that $p^k \| 4N$,
then $p$ does not divide at least one of $m$ and $n$, as $(m,n) = 1$. We deduce that $d$ is
a square mod $p^k$ from the fact that $dn^2$ is a square mod $p^k$ if $p$ does not divide $n$,
and from the fact that $dm^2$ is a square mod $p^k$ if $p$ does not divide $m$. The result,
that $d$ is a square mod $4N$, now follows from the Chinese Remainder Theorem.
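

Part (ii) of the proof is constructive, and the construction can be sketched in a few lines of Python. The function name `form_representing` is hypothetical, and the brute-force search over $b$ is an assumption made for simplicity (it is only sensible for small $N$).

```python
def form_representing(N, d):
    """Look for b with b^2 = d (mod 4N). If one exists, return the form
    (N, b, c) of discriminant d, which represents N at (x, y) = (1, 0);
    otherwise return None. Assumes N >= 1."""
    # b and b + 2N have the same square mod 4N, so 0 <= b < 2N suffices.
    for b in range(2 * N):
        if (b * b - d) % (4 * N) == 0:
            return (N, b, (b * b - d) // (4 * N))
    return None

print(form_representing(5, -4))     # -4 is a square mod 20
print(form_representing(3, -4))     # -4 is not a square mod 12
```

The form returned need not be reduced, and the proposition does not say which equivalence class it lies in.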
For a given odd prime $p$, Proposition 12.3.1 tells us that $p$ is represented by some
binary quadratic form of discriminant $d$ if and only if $(d/p) = 1$ or $0$. However it does
not tell us which binary quadratic form. In section 9.6 we could not immediately
determine which of the two reduced binary quadratic forms of discriminant $-20$,
namely $x^2 + 5y^2$ and $2x^2 + 2xy + 3y^2$, represents which primes $p$ with $(-20/p) = 1$.
There we found we could distinguish which prime was represented by which form
by also studying the values of $(p/d)$. We now see how this works out in general.


We can appeal to Corollary 9.4.1 to restrict the possibilities for the binary
quadratic forms of discriminant $d$ that represent $N$. Given a primitive binary
quadratic form $f$ of discriminant $d$ we define, for each odd prime $p$ dividing $d$,


$\sigma_f(p) = \left(\frac{a}{p}\right)$ if $p$ does not divide $a$, and $\sigma_f(p) = \left(\frac{c}{p}\right)$ if $p$ does divide $a$.


If $p$ divides $a$, then $p$ divides $d + 4ac = b^2$ and so divides $b$, and therefore cannot
divide $c$ as $f$ is primitive. Therefore $\sigma_f(p)$ equals $1$ or $-1$ for each such $p$.


Exercise 12.3.1.† Prove that if $f \sim g$, then $\sigma_f(p) = \sigma_g(p)$ for all odd primes $p$ dividing $d$.


Corollary 12.3.1. Suppose that $d$ is a fundamental discriminant and that $N$ is a
squarefree integer for which $(N,d) = 1$. If $d$ is a square mod $4N$, then there exists
a binary quadratic form $f$ of discriminant $d$ that properly represents $N$ such that


$\sigma_f(p) = \left(\frac{N}{p}\right)$ for every odd prime $p$ dividing $d$.


Proof. There exists a binary quadratic form $f$ of discriminant $d$ that properly
represents $N$, by Proposition 12.3.1(ii). Therefore $N$ is represented by inserting
rationals into $f$ and this happens, by Corollary 9.4.1, if and only if $\left(\frac{N}{p}\right) = \sigma_f(p)$
for every odd prime $p$ dividing $d$.


When $d = -20$ we have $\sigma_f(5) = 1$ for $f = x^2 + 5y^2$ and $\sigma_f(5) = (2/5) = -1$
for $f = 2x^2 + 2xy + 3y^2$. This can certainly settle such issues in several cases.

There are three reduced quadratic forms $[1, 1, 6]$, $[2, \pm 1, 3]$ with $d = -23$.
However $\sigma_f(23) = 1$ for each of these, so this does not help us to distinguish between the
integers represented by these quadratic forms. This case is much more complicated
and beyond the scope of this book.
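

The values of $\sigma_f(p)$ quoted for $d = -20$ and $d = -23$ can be recomputed with a few lines of Python. This sketch evaluates the Legendre symbol by Euler’s criterion, which is a standard technique but not one the text itself specifies; the function names are hypothetical.

```python
def legendre(n, p):
    """Legendre symbol (n/p) for an odd prime p, via Euler's criterion
    n^((p-1)/2) mod p. Returns 0, 1, or -1."""
    r = pow(n % p, (p - 1) // 2, p)
    return -1 if r == p - 1 else r

def sigma(form, p):
    """sigma_f(p) for a primitive form f = [a, b, c] and an odd prime p
    dividing its discriminant: (a/p) if p does not divide a, else (c/p)."""
    a, b, c = form
    return legendre(a, p) if a % p != 0 else legendre(c, p)

print(sigma((1, 0, 5), 5), sigma((2, 2, 3), 5))            # d = -20
print([sigma(f, 23) for f in [(1, 1, 6), (2, 1, 3), (2, -1, 3)]])  # d = -23
```

For $d = -20$ the two forms are separated by $\sigma_f(5)$, while for $d = -23$ all three forms give the same value.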


We develop these ideas further in section 12.11 of appendix 12B.


Exercise 12.3.2. Prove that if $p_1, \ldots, p_k$ are distinct primes that are each represented by some
form of discriminant $d$, then $p_1 \cdots p_k$ is also represented by some form of discriminant $d$.


12.4. Class numbers 


Theorem 12.2. If $d < 0$, then there are only finitely many reduced binary quadratic
forms of discriminant $d$.


Proof. For a reduced binary quadratic form, $|d| = 4ac - |b|^2 \ge 4a \cdot a - a^2 = 3a^2$
and so $a$ is a positive integer for which


$a \le \sqrt{|d|/3}.$

Therefore for a given $d < 0$ there are only finitely many $a$, and so $b$ (as $|b| \le a$),
but then $c = (b^2 - d)/4a$ is determined, and so there are only finitely many reduced
binary quadratic forms of discriminant $d$.


Let $h(d)$ denote the class number, the number of equivalence classes of binary
quadratic forms of discriminant $d$. We have just shown $h(d)$ is finite, and the proof
of Theorem 12.2 even describes an algorithm to easily find all the reduced binary
quadratic forms of a given discriminant $d < 0$. In fact $h(d) \ge 1$ since we always
have the principal form. If $h(d) = 1$, then all binary quadratic forms are equivalent
to the principal form.
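

The algorithm implicit in the proof of Theorem 12.2 can be sketched as follows: run over $1 \le a \le \sqrt{|d|/3}$ and $-a < b \le a$, and keep those pairs for which $c = (b^2 - d)/4a$ is an integer satisfying the conditions for a reduced form. The function name `reduced_forms` is hypothetical, and this sketch lists every reduced form, primitive or not.

```python
def reduced_forms(d):
    """List all reduced forms (a, b, c) of discriminant d < 0, where
    d = 0 or 1 (mod 4). Non-primitive forms are included."""
    assert d < 0 and d % 4 in (0, 1)
    forms = []
    a = 1
    while 3 * a * a <= -d:                     # a <= sqrt(|d|/3)
        for b in range(-a + 1, a + 1):         # -a < b <= a
            if (b * b - d) % (4 * a) == 0:
                c = (b * b - d) // (4 * a)
                # reduced: a <= c, and b >= 0 whenever a == c
                if c > a or (c == a and b >= 0):
                    forms.append((a, b, c))
        a += 1
    return forms

for d in (-4, -20, -23, -163):
    print(d, reduced_forms(d))
```

For a fundamental discriminant every form is primitive, so the length of the list is the class number in that case.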


Example. If $d = -163$, then $|b| \le a \le \sqrt{163/3} < 8$. But $b$ is odd, since $b \equiv b^2 =
d + 4ac \equiv d \pmod 2$, so $|b| = 1, 3, 5$, or $7$. Therefore $ac = (b^2 + 163)/4 = 41, 43, 47$,
or $53$, a prime, with $0 < a \le c$ and hence $a = 1$. Since $b$ is odd and $-a < b \le a$, we
deduce that $b = 1$ and so $c = 41$. Hence $x^2 + xy + 41y^2$ is the only reduced binary
quadratic form of discriminant $-163$, and therefore $h(-163) = 1$.


Exercise 12.4.1. Determine all of the reduced binary quadratic forms of discriminant d for 
—20 < d< —1 as well as for d = —28, —43, —67, —167, and —171. 


Exercise 12.4.2. Determine all of the reduced binary quadratic forms of discriminant d for 
d = —3, —15, —23, —39, —47, —87, —71, and —95. 


Exercise 12.4.3. Determine all of the reduced binary quadratic forms of discriminant d for 
d= —4, —20, —56, and —104. 


Exercise 12.4.4. Prove that if $ax^2 + bxy + cy^2$ is a reduced binary quadratic form of discriminant
$d < 0$, then $|c| \ge \sqrt{|d|}/2$.

Corollary 12.5.1. Suppose that $h(d) = 1$. Then $N$ is properly represented by the
form of discriminant $d$ if and only if $d$ is a square mod $4N$.


Proof. This follows immediately from Proposition 12.3.1 since there is just one
equivalence class of quadratic forms of discriminant $d$, and forms in the same
equivalence class represent the same integers by Proposition 12.1.1.


We have $h(-4) = 1$ and so Corollary 12.5.1 implies that integer $N$ is properly
represented by $x^2 + y^2$ if and only if $-4$ is a square mod $4N$. This is more or less
Theorem 9.2 (and can be deduced from its proof).

In the example above (with $d = -163$) we showed that $x^2 + xy + 41y^2$ is the only
reduced binary quadratic form of discriminant $-163$. This implies, by Corollary 12.5.1,
that if prime $p \ne 2$ or $163$, then it can be represented by the binary quadratic form
$x^2 + xy + 41y^2$ if and only if $(-163/p) = 1$.

In exercise 12.4.1 we exhibited nine fundamental discriminants $d < 0$ with
$h(d) = 1$, namely $d = -3, -4, -7, -8, -11, -19, -43, -67$, as well as $-163$. It
turns out these are the only ones with class number one.² Therefore, as in the
example above, if $p \nmid 2d$, then
$p$ is represented by $x^2 + y^2$ if and only if $(-1/p) = 1$;
$p$ is represented by $x^2 + 2y^2$ if and only if $(-2/p) = 1$;
$p$ is represented by $x^2 + xy + y^2$ if and only if $(-3/p) = 1$;
$p$ is represented by $x^2 + xy + 2y^2$ if and only if $(-7/p) = 1$;
$p$ is represented by $x^2 + xy + 3y^2$ if and only if $(-11/p) = 1$;
$p$ is represented by $x^2 + xy + 5y^2$ if and only if $(-19/p) = 1$;
$p$ is represented by $x^2 + xy + 11y^2$ if and only if $(-43/p) = 1$;
$p$ is represented by $x^2 + xy + 17y^2$ if and only if $(-67/p) = 1$;
$p$ is represented by $x^2 + xy + 41y^2$ if and only if $(-163/p) = 1$.


Euler noticed that the polynomial $x^2 + x + 41$ is prime for $x = 0, 1, 2, \ldots, 39$, and
similarly the other polynomials above. Rabinowiscz proved that this is an “if and
only if” condition:


Theorem 12.3 (Rabinowiscz’s criterion). We have $h(1 - 4A) = 1$ for $A \ge 2$ if and
only if $n^2 + n + A$ is prime for $n = 0, 1, 2, \ldots, A - 2$.


At $n = A - 1$ the polynomial takes value $(A - 1)^2 + (A - 1) + A = A^2$ which
is composite. We will prove Rabinowiscz’s criterion below.
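

The criterion is easy to test by machine for small $A$. In the sketch below, the primality test is plain trial division and `count_reduced_forms` counts every reduced form of discriminant $d$, primitive or not; both choices, and all function names, are assumptions of this illustration. The counting convention matters for non-fundamental discriminants such as $d = -27$ (that is, $A = 7$), where $[3, 3, 3]$ is a reduced but non-primitive form.

```python
def is_prime(n):
    """Trial-division primality test."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True

def rabinowitsch_holds(A):
    """True if n^2 + n + A is prime for n = 0, 1, ..., A - 2."""
    return all(is_prime(n * n + n + A) for n in range(A - 1))

def count_reduced_forms(d):
    """Number of reduced forms (primitive or not) of discriminant d < 0."""
    count, a = 0, 1
    while 3 * a * a <= -d:
        for b in range(-a + 1, a + 1):
            if (b * b - d) % (4 * a) == 0:
                c = (b * b - d) // (4 * a)
                if c > a or (c == a and b >= 0):
                    count += 1
        a += 1
    return count

good = [A for A in range(2, 60) if rabinowitsch_holds(A)]
print(good)
print([1 - 4 * A for A in good])
```

The discriminants printed on the last line are the odd ones from the list of nine given above, other than $-3$ (which would correspond to $A = 1$).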


What about when the class number is not one? In the example with $d = -20$
we have $h(-20) = 2$; the two reduced forms are $x^2 + 5y^2$ and $2x^2 + 2xy + 3y^2$. By
Proposition 12.3.1, $p$ is represented by at least one of these two forms if and only
if $(-5/p) = 0$ or $1$, that is, if $p \equiv 1, 3, 7$, or $9 \pmod{20}$ or $p = 2$ or $5$. Can we
decide which of these primes are represented by which of the two forms? Note that
if $p = x^2 + 5y^2$, then $(p/5) = 0$ or $1$ and so $p = 5$ or $p \equiv \pm 1 \pmod 5$, and thus
$p \equiv 1$ or $9 \pmod{20}$. If $p = 2x^2 + 2xy + 3y^2$, then $2p = (2x + y)^2 + 5y^2$ and so
$p = 2$ or $(2p/5) = 1$; that is, $(p/5) = -1$, and hence $p \equiv 3$ or $7 \pmod{20}$. Hence
we have proved


$p$ is represented by $x^2 + 5y^2$ if and only if $p = 5$, or $p \equiv 1$ or $9 \pmod{20}$;
$p$ is represented by $2x^2 + 2xy + 3y^2$ if and only if $p = 2$, or $p \equiv 3$ or $7 \pmod{20}$.
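

These two statements can be checked by brute force for small primes. The sketch below assumes a positive definite form and bounds the search using $4aN = (2ax + by)^2 + |d|y^2$ and the analogous identity with the roles of $x$ and $y$ swapped; the function name `represents` and the cutoff 100 are illustrative choices.

```python
from math import isqrt

def represents(form, N):
    """Brute-force test of whether N = a x^2 + b x y + c y^2 has an integer
    solution, for a positive definite form (a > 0, b^2 - 4ac < 0)."""
    a, b, c = form
    abs_d = 4 * a * c - b * b
    # |y| <= sqrt(4aN/|d|) and |x| <= sqrt(4cN/|d|); +1 is a safety margin.
    x_max = isqrt(4 * c * N // abs_d) + 1
    y_max = isqrt(4 * a * N // abs_d) + 1
    return any(a * x * x + b * x * y + c * y * y == N
               for x in range(-x_max, x_max + 1)
               for y in range(-y_max, y_max + 1))

primes = [p for p in range(2, 100) if all(p % q for q in range(2, isqrt(p) + 1))]
print([p for p in primes if represents((1, 0, 5), p)])
print([p for p in primes if represents((2, 2, 3), p)])
```

The two printed lists should be disjoint, and each should match the corresponding congruence condition modulo 20.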


That is, we can distinguish which primes can be represented by which binary
quadratic form of discriminant $-20$, through congruence conditions, despite the fact
that the class number is not one. However we cannot always do this; that is, we
cannot always distinguish which primes are represented by which binary quadratic
form of discriminant $d$. It is understood how to recognize those discriminants $d$ for
which we can determine which binary quadratic forms of discriminant $d$ represent
which integers simply through congruence conditions (see section 12.11 of appendix
12B). These idoneal numbers were recognized by Euler. He found 65 of them, and
no more are known; it is an open conjecture as to whether Euler’s list is complete.
It is known that there can be at most one further undiscovered idoneal number, but
it seems unlikely that the techniques used can rule out this putative example.³


² The proof that the above list gives all of the $d < 0$, for which $h(d) = 1$, has an interesting history.
By 1934 it was known that there is no more than one further such $d$, but that putative $d$ could not be
ruled out by the method. In 1952, Kurt Heegner, a German school teacher, proposed an extraordinary
proof that there are no further $d$. At the time his paper was ignored since it was based on a result
from an old book (of Weber) whose proof was known to be incomplete. In 1966 Alan Baker gave a
very different (and more obviously correct) proof that this was the complete list of discriminants with
class number one, and this was widely acknowledged to be correct. However, soon afterwards Stark
realized that the proofs in Weber are easily corrected, so that Heegner’s work had been fundamentally
correct. Heegner was subsequently given credit for solving this famous problem, but sadly only after he
had died. Heegner’s paper contains a most extraordinary construction, widely regarded to be one of the
most creative and influential in modern number theory.


Exercise 12.5.1. (a) Determine the two reduced binary quadratic forms of discriminant $-15$.
(b) Determine which reduced residue classes can be represented by some form of discriminant
$-15$.
(c) Distinguish which primes are represented by which form (with proof).


Proof of Rabinowiscz’s criterion. We begin by showing that $f(n) := n^2 + n + A$
is composite for some integer $n$ in the range $0 \le n \le A - 2$, if and only if $d = 1 - 4A$
is a square mod $4p$ for some prime $p < A$. For if $n^2 + n + A$ is composite, let
$p$ be its smallest prime factor so that $p \le f(n)^{1/2} < f(A - 1)^{1/2} = A$. Then
$(2n + 1)^2 - d = 4(n^2 + n + A) \equiv 0 \pmod{4p}$ so that $d$ is a square mod $4p$. On
the other hand if $d$ is a square mod $4p$ where $p$ is a prime $\le A - 1$, select $m$ to be
the smallest positive integer such that $d \equiv m^2 \pmod{4p}$. Then $m < 2p$ (or else
replace $m$ by $4p - m$) and $m$ is odd (as $d$ is odd), so write $m = 2n + 1$ and then
$0 \le n \le p - 1 \le A - 2$ with $d \equiv (2n + 1)^2$ mod $4p$. Therefore $p$ divides $n^2 + n + A$
with $p < A = f(0) \le f(n)$ so that $n^2 + n + A$ is composite.

Now we show that $h(d) > 1$ if and only if $d = 1 - 4A$ is a square mod $4p$
for some prime $p < A$. If $h(d) > 1$, then there exists a reduced binary quadratic
form $ax^2 + bxy + cy^2$ of discriminant $d$ with $1 < a \le \sqrt{|d|/3} < A$, by the bound on $a$
obtained in the proof that there are only finitely many reduced forms of discriminant $d$.
If $p$ is a prime factor of $a$, then $p \le a < A$ and $d = b^2 - 4ac$ is a square mod
$4p$. On the other hand if $d$ is a square mod $4p$ for some prime $p < A$, and $h(d) = 1$,
then $p$ is represented by $x^2 + xy + Ay^2$ by Proposition 12.3.1(ii). Now $y \ne 0$ as $p$ is
not a square. Therefore $4p = (2x + y)^2 + |d|y^2 \ge 0^2 + |d| \cdot 1^2 = |d|$; that is, $p \ge A$,
a contradiction. (We will extend this proof to obtain more on the small values
taken by any binary quadratic form of negative discriminant, in exercise 12.6.1(a).)
Hence $h(d) > 1$.

Putting these two results together, we deduce that $h(d) > 1$ if and only if
$f(n) := n^2 + n + A$ is composite for some integer $n$ in the range $0 \le n \le A - 2$,
which implies Rabinowiscz’s criterion.


Exercise 12.5.2.1 Prove that if n? + n+ A is prime for all integers n in the range 0 <n < B, 
where 1 < B < (A—1)/2, then (4) = —1 for all primes p < 2B +1. 


The class number one problem for even negative fundamental discriminants is
not difficult:


Theorem 12.4. If $h(d) = 1$ with $d = -4n$ for $n \in \mathbb{N}$, then $n = 1, 2, 3, 4$, or $7$.


Proof. Suppose that $h(-4n) = 1$. Then $n$ must be a prime power or else there
exist coprime integers $1 < a < c$ for which $ac = n$ and so $[a, 0, c]$ is a non-principal
reduced form of discriminant $-4n$. Moreover $n + 1$ must be an odd prime or a
power of 2 or else there exist integers $1 < a \le c$ with $\gcd(a, 2, c) = 1$ for which
$ac = n + 1$ and so $[a, 2, c]$ is a non-principal reduced form of discriminant $-4n$.

One of $n$ and $n + 1$ is even and hence must be a power of 2 (from the previous
paragraph). If $n = 2^k$ with $k \ge 4$, then we have the non-principal reduced form
$[4, 4, 2^{k-2} + 1]$, and if $n + 1 = 2^k$ with $k \ge 6$, then we have the non-principal reduced
form $[8, 6, 2^{k-3} + 1]$.

Therefore if $h(-4n) = 1$, then $n = 1, 2, 4$, or $8$ or $n + 1 = 2, 4, 8, 16$, or $32$. We
can rule out $n = 15$ (as 15 is composite) and $n = 8$ (as 9 is not an odd prime) and
$n = 31$ (as $[5, 4, 7]$ is a non-principal reduced form of discriminant $-124$). We know
that $h(-4n) = 1$ for $n = 1, 2, 3, 4$, and $7$ by exercise 12.4.1.


³ We therefore find ourselves in much the same situation as for class number one before Heegner’s
work, as discussed in the last footnote.
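

The conclusion of Theorem 12.4 can be checked numerically for small $n$. The proof works with forms satisfying $\gcd(a, b, c) = 1$, so this sketch counts only primitive reduced forms; that reading, the function name `primitive_class_number`, and the cutoff $n \le 200$ are assumptions of the illustration.

```python
from math import gcd

def primitive_class_number(d):
    """Count reduced forms (a, b, c) of discriminant d < 0 with gcd(a, b, c) = 1."""
    count, a = 0, 1
    while 3 * a * a <= -d:                     # a <= sqrt(|d|/3)
        for b in range(-a + 1, a + 1):         # -a < b <= a
            if (b * b - d) % (4 * a) == 0:
                c = (b * b - d) // (4 * a)
                reduced = c > a or (c == a and b >= 0)
                if reduced and gcd(gcd(a, b), c) == 1:
                    count += 1
        a += 1
    return count

print([n for n in range(1, 201) if primitive_class_number(-4 * n) == 1])
```

For $n = 3, 4, 7$ there is a second reduced form ($[2,2,2]$, $[2,0,2]$, $[2,2,4]$ respectively), but it is not primitive and so is not counted here.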


These discriminants have a beautiful property. 


Corollary 12.5.2. Let $n = 1, 2, 3, 4$, or $7$. If $p$ is a prime that does not divide $4n$,
then $p$ can be written as $u^2 + nv^2$ if and only if $\left(\frac{-n}{p}\right) = 1$.


Proof. As we just discussed $h(-4n) = 1$, and so all binary quadratic forms of
discriminant $-4n$ are equivalent to $x^2 + ny^2$. By Proposition 12.3.1, $p$ can be
represented by some form of discriminant $-4n$ if and only if $-4n$ is a square mod
$p$, and the result follows.


We had already discussed representations of $p$ by $x^2 + y^2$, $x^2 + 2y^2$, $x^2 + 3y^2$
in sections 9.1 and 9.2, and $x^2 + 4y^2 = x^2 + (2y)^2$ follows easily from $x^2 + y^2$. This
leaves only the most interesting of the cases of Corollary 12.5.2:


$p = x^2 + 7y^2$ if and only if $p \equiv 1, 9, 11, 15, 23$, or $25 \pmod{28}$.


Exercise 12.5.3. Let $q$ be a prime $\equiv -1 \pmod 4$. Prove that $\left(\frac{p}{q}\right) = -1$ for all primes $p < \frac{q+1}{4}$
if and only if $h(-q) = 1$. This result suggests that finding a small prime $p$ with $\left(\frac{p}{q}\right) = 1$ can be
a deep problem (see appendix 8B for a discussion of small quadratic residues).


For much more on the values taken by binary quadratic forms, particularly the 
prime values, we recommend David Cox’s wonderful book [I]. 
Additional exercises


These last questions get considerably more involved but may be of interest to stu- 
dents interested in further pursuing number theory. 


Exercise 12.6.1. Suppose that f(a, y) = ax? + bry + cy? is a reduced binary quadratic form. 
(a) Show that if am? + bmn + cn? < a— |b] +c with (m,n) = 1, then |ml, |n| < 1. 
(b) Prove that the least values properly represented by f are a < c < a— |b| +c, the first 
two properly represented twice, the last twice unless b = 0, in which case it is properly 
represented four times. 


Exercise 12.6.2. We now use the results of exercise 12.6.1 to understand equivalences between
primitive reduced binary quadratic forms. The idea is to recognize a reduced binary quadratic
form by the smallest values it properly represents.
(a) Prove that:
• If $0 < |b| < a < c$, then $[a,b,c]$ properly represents $a$, $c$, and $a - |b| + c$ in exactly 2, 2,
and 2 different ways, respectively.
• If $0 < |b| = a < c$, then $[a,b,c]$ properly represents $a$, and $c = a - |b| + c$ in exactly 2,
and 4 different ways, respectively.
• If $0 < |b| < a = c$, then $[a,b,c]$ properly represents $a = c$, and $a - |b| + c$ in exactly 4,
and 2 different ways, respectively.
• If $0 = |b| < a < c$, then $[a,b,c]$ properly represents $a$, $c$, and $a - |b| + c$ in exactly 2, 2,
and 4 different ways, respectively.
• $[1,1,1]$ properly represents 1 in exactly six different ways.
• $[1,0,1]$ properly represents both 1 and 2 in exactly four different ways.
(b) Deduce that if $[a,b,c]$ and $[A,B,C]$ are equivalent primitive reduced binary quadratic forms,
then $A = a$, $C = c$, and $B = b$ or $-b$.
(c) Use exercise 12.6.1(a) to show that the entries of a matrix representing such an equivalence
must each be $-1$, 0, or 1.
(d) Prove that distinct primitive reduced binary quadratic forms are all inequivalent. Together
with Theorem 12.1 this implies that every positive definite binary quadratic form is properly
equivalent to a unique reduced form.
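(e) Suppose that $M \in SL(2,\mathbb{Z})$ transforms a primitive reduced binary quadratic form to itself
(this is an automorphism). Show that $M = \pm I$, except in the following two cases:
• $[1,1,1]$ has automorphisms given by $\pm I$, $\pm\begin{pmatrix} 1 & 1 \\ -1 & 0 \end{pmatrix}$ and $\pm\begin{pmatrix} 0 & -1 \\ 1 & 1 \end{pmatrix}$.
• $[1,0,1]$ has automorphisms given by $\pm I$ and $\pm\begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$.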


Exercise 12.6.3. (a) Show that if $[A,B,C] \sim [a,b,c]$, then $[A,-B,C] \sim [a,-b,c]$.
(b) Use exercise 12.6.2(d) to show that if $[a,b,c]$ is reduced, then $[a,b,c] \sim [a,-b,c]$ if and only
if $b = 0$, $b = a$, or $a = c$.
(c) Deduce that $[A,B,C] \sim [A,-B,C]$ if and only if they are equivalent to a quadratic form
$[a,0,c]$, $[a,a,c]$, or $[a,b,a]$.
(d) Prove that $[a,a,c] \sim [c, 2c-a, c]$.
(e) If $d < 0$ is odd, then show that the primitive reduced forms are given by taking each
factorization $-d = rs$ with $0 < r < s$ and $(r,s) = 1$,

$[a,a,c]$ if $s > 3r$ where $a = r$ and $c = (r+s)/4$,
$[a,b,a]$ if $s < 3r$ where $a = (r+s)/4$ and $b = (s-r)/2$.

(f) If $d < 0$ is even, then show that the primitive reduced forms are given by taking each
factorization $-d/4 = rs$ with $0 < r < s$ and $(r,s) = 1$,

$[a,0,c]$ with $a = r$ and $c = s$,
$[a,a,c]$ if $s > 3r$ where $a = 2r$ and $c = (r+s)/2$,
$[a,b,a]$ if $s < 3r$ where $a = (r+s)/2$ and $b = s - r$.

Note that the last two cases hold only if $d/4$ is odd.
(g) Show that each binary quadratic form either represents both $r$ and $s$, or both $2r$ and $2s$.
(In (d), take $f(1,-2) = s$ in the first case; $f(1,1) = s$, $f(1,-1) = r$ in the second case.)
(h) Deduce that if $d < 0$ is a fundamental discriminant, then there are exactly $2^{t-1}$ reduced
binary quadratic forms for which $[a,b,c] \sim [a,-b,c]$, where $t$ is the number of odd prime
divisors of $|d|$, unless $4 \mid d$ in which case there are $2^t$.


Exercise 12.6.4. (a) Prove that $x^2 + 6y^2$ and $2x^2 + 3y^2$ are the only binary quadratic forms,
up to equivalence, of discriminant $-24$.
(b) Prove that prime $p$ can be written in the form $a^2 + 6b^2$ if and only if $p \equiv 1$ or $7 \pmod{24}$.
(c) Prove that prime $p$ can be written in the form $2u^2 + 3v^2$ if and only if $p = 2$ or 3, or $p \equiv 5$
or $11 \pmod{24}$.
We can refine this further:
(d) Prove that prime $p$ can be written in the form $a^2 + 24B^2$ if and only if $p \equiv 1 \pmod{24}$.
(e) Prove that prime $p$ can be written in the form $8U^2 + 3v^2$ if and only if $p = 3$, or $p \equiv 11
\pmod{24}$.
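
Part (a) of the exercise can be explored by listing reduced forms directly. The sketch below assumes the usual reduction conditions for negative discriminant ($-a < b \le a < c$, or $0 \le b \le a = c$); the function name is ours, and the list it returns also contains imprimitive forms when they exist. It is an experiment, not a solution to the exercise.

```python
def reduced_forms(d):
    """List reduced forms (a, b, c) with b^2 - 4ac = d, for d < 0.

    Assumed reduction conditions: -a < b <= a < c, or 0 <= b <= a = c.
    A reduced form satisfies 3a^2 <= |d|, which bounds the search.
    """
    forms = []
    a = 1
    while 3 * a * a <= -d:
        for b in range(-a + 1, a + 1):
            if (b * b - d) % (4 * a) == 0:
                c = (b * b - d) // (4 * a)
                if c > a or (c == a and b >= 0):
                    forms.append((a, b, c))
        a += 1
    return forms
```

For example, `reduced_forms(-24)` returns the two forms named in part (a), and `reduced_forms(-71)` returns seven forms.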


Automorphisms of binary quadratic forms. 


Exercise 12.6.5. Suppose that f ~ g via the transformation M and that G is the group of 
automorphisms of f. 
(a) Prove that $M^{-1}GM$ is the group of automorphisms of $g$.
(b) Prove that MG is the set of transformations yielding g from f. 
(c) Deduce that there are w(d) automorphisms of every primitive quadratic form of discriminant 
d, where w(—3) = 6,w(—4) = 4, and w(d) = 2 for all other discriminants d < 0. 


Exercise 12.6.6. (a) If $N = f(a,b)$, then $N = f(-a,-b)$. If $N = a^2 + b^2$, then $N = b^2 +
(-a)^2 = (-a)^2 + (-b)^2 = (-b)^2 + a^2$. If $N = a^2 + ab + b^2$, then find five other representations
of $N$ by the quadratic form $x^2 + xy + y^2$.
(b) Explain how these representations correspond to the automorphisms of the quadratic form. 
(c) Why did we not include N = (—a)? + b? in the representations in part (a)? 


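Exercise 12.6.7. (a) Let $\alpha, \beta, \gamma, \delta$ be given integers for which $\alpha\delta - \beta\gamma = 1$. Prove that $\beta', \delta'$
are integers for which $\alpha\delta' - \beta'\gamma = 1$ if and only if there exists an integer $k$ such that

$$\begin{pmatrix} \alpha & \beta' \\ \gamma & \delta' \end{pmatrix} = \begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix} \begin{pmatrix} 1 & k \\ 0 & 1 \end{pmatrix}.$$

(b) If $A = f(\alpha, \gamma)$ with $(\alpha, \gamma) = 1$, then prove that there exists a unique pair of integers $\beta, \delta$
such that $f \sim [A,B,C]$ using the matrix $M = \begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix} \in SL(2,\mathbb{Z})$ for some integer $B$ in
the range $-A < B \le A$.

(c) Deduce that the proper representations of the integer $A$ by reduced binary quadratic forms
of discriminant $d$ are in $w(d)$-to-1 correspondence with the solutions to $B^2 \equiv d \pmod{4A}$
with $-A < B \le A$.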


Exercise 12.6.8. Let $f_1, \ldots, f_h$ be the $h = h(d)$ distinct reduced binary quadratic forms of
discriminant $d$, where $d \equiv 0$ or $1 \pmod 4$. Let $r_i(A)$ denote the number of proper representations
of $A$ by $f_i$. Prove that

$$r_1(A) + \cdots + r_h(A) = w(d) \cdot \#\{B \pmod{2A} : B^2 \equiv d \pmod{4A}\}$$

and that this equals $w(d) \cdot \prod_{p \mid A} \left(1 + \left(\frac{d}{p}\right)\right)$ unless perhaps $4 \mid (A, d)$.


Exercise 12.6.9. Suppose that $p$ is an odd prime for which $(d/p) = 1$. Prove that $p$ is properly
represented either by only the principal form of discriminant $d$, or by only two non-principal,
reduced, binary quadratic forms of discriminant $d$, one, say, $ax^2 + bxy + cy^2$, the other $ax^2 - bxy + cy^2$.


Transformations of the upper half-plane. Let $\mathbb{H} := \{z \in \mathbb{C} : \mathrm{Im}(z) > 0\}$ be
the upper half-plane. We consider transformations with $M \in SL(2,\mathbb{Z})$ acting on
$z \in \mathbb{C}$ by taking $M \binom{z}{1} = \binom{u}{v}$ and considering this to be the map $z \to u/v$. In
Theorem [2] we saw that every matrix in $SL(2,\mathbb{Z})$ can be represented as a
product of the two fundamental matrices $S = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}$ and $T = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$.


Exercise 12.6.10. Prove that $S$ represents the transformation $z \to z + 1$ and that $T$ represents
the transformation $z \to -1/z$.
We define
$$F := \left\{z \in \mathbb{H} : |z| > 1 \text{ and } -\tfrac{1}{2} \le \mathrm{Re}(z) < \tfrac{1}{2}\right\} \cup \left\{z \in \mathbb{H} : |z| = 1 \text{ and } -\tfrac{1}{2} \le \mathrm{Re}(z) \le 0\right\}.$$


Figure 12.1. The shaded region is the fundamental domain $F \subset \mathbb{H}$.


Exercise 12.6.11.† Prove that the binary quadratic form $ax^2 + bxy + cy^2$ with discriminant
$d < 0$ is reduced if and only if $\frac{-b + \sqrt{d}}{2a} \in F$.


Exercise 12.6.12.† Prove that for every $z \in \mathbb{H}$ there exists $M \in SL(2,\mathbb{Z})$ such that $Mz \in F$.
Prove that $M$ is unique.


Exercise 12.6.13.† Show that $\{MF : M \in SL(2,\mathbb{Z})\}$ is a partition of $\mathbb{H}$ into disjoint sets.


The shaded region is F. Each enclosed region is a 
domain MF for some M € SL(2,Z). 


Appendix 12A. Composition 
rules: Gauss, Dirichlet, 
and Bhargava 


We study generalizations of the identity (9.1.1), which leads to a notion of “multi- 
plying” binary quadratic forms together, and hence to the group structure discov- 
ered by Gauss. We go on to study the reformulations of Dirichlet and Bhargava. 


12.7. Composition and Gauss 


In (9.1.1) we see that the product of any two integers represented by the binary
quadratic form $x^2 + y^2$ is also an integer represented by that binary quadratic form.
We now look for further such identities. One easy generalization is given by


(12.7.1) $(u^2 + Dv^2)(r^2 + Ds^2) = x^2 + Dy^2$ where $x = ur + Dvs$ and $y = us - vr$.


Therefore the product of any two integers represented by the binary quadratic form
$x^2 + Dy^2$ is also an integer represented by that binary quadratic form. For general
diagonal binary quadratic forms (that is, having no “cross term” $bxy$) we have


(12.7.2) $(au^2 + cv^2)(ar^2 + cs^2) = x^2 + acy^2$ where $x = aur + cvs$ and $y = us - vr$.


Notice here that the quadratic form on the right-hand side is different from those on
the left; that is, the product of any two integers represented by the binary quadratic
form $ax^2 + cy^2$ is an integer represented by the binary quadratic form $x^2 + acy^2$.
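
Identities such as (12.7.1) and (12.7.2) are polynomial identities of degree at most 2 in each variable, so agreement on a small integer grid already confirms them. A minimal sketch, with a function name and grid size of our own choosing:

```python
from itertools import product

def diagonal_identity_holds(a, c, size=3):
    """Check (12.7.2) on the grid [-size, size]^4; a = 1, c = D gives (12.7.1)."""
    for u, v, r, s in product(range(-size, size + 1), repeat=4):
        x = a * u * r + c * v * s
        y = u * s - v * r
        lhs = (a * u * u + c * v * v) * (a * r * r + c * s * s)
        if lhs != x * x + a * c * y * y:
            return False
    return True
```

For example, `diagonal_identity_holds(1, 5)` tests (12.7.1) with $D = 5$, and `diagonal_identity_holds(2, 3)` tests (12.7.2) for $2x^2 + 3y^2$.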


One can come up with a similar identity no matter what the quadratic form,
though one proceeds slightly differently depending on whether the coefficient $b$ is
odd or even. The discriminant $d = b^2 - 4ac$ has the same parity as $b$. If $d$ is even,
then

(12.7.3) $(au^2 + buv + cv^2)(ar^2 + brs + cs^2) = x^2 - \frac{d}{4}y^2$,

where $x = aur + \frac{b}{2}(vr + us) + cvs$ and $y = rv - su$.
If $d$ is odd, then

(12.7.4) $(au^2 + buv + cv^2)(ar^2 + brs + cs^2) = x^2 + xy - \frac{d-1}{4}y^2$,

where $x = aur + \frac{b-1}{2}vr + \frac{b+1}{2}us + cvs$ and $y = rv - su$.


That is, the product of two integers represented by the same binary quadratic form
can be represented by the principal binary quadratic form of the same discriminant.
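
The two cases (12.7.3) and (12.7.4) can be checked together on an integer grid. The sketch below is an illustration with names of our own; it picks the even or odd formula according to the parity of $b$.

```python
from itertools import product

def same_form_identity_holds(a, b, c, size=2):
    """Check (12.7.3) (b even) or (12.7.4) (b odd) on the grid [-size, size]^4."""
    d = b * b - 4 * a * c
    for u, v, r, s in product(range(-size, size + 1), repeat=4):
        lhs = (a*u*u + b*u*v + c*v*v) * (a*r*r + b*r*s + c*s*s)
        y = r * v - s * u
        if b % 2 == 0:
            x = a*u*r + (b // 2) * (v*r + u*s) + c*v*s
            rhs = x*x - (d // 4) * y*y
        else:
            x = a*u*r + ((b - 1) // 2) * v*r + ((b + 1) // 2) * u*s + c*v*s
            rhs = x*x + x*y - ((d - 1) // 4) * y*y
        if lhs != rhs:
            return False
    return True
```

All integer divisions here are exact: $d \equiv 0 \pmod 4$ when $b$ is even and $d \equiv 1 \pmod 4$ when $b$ is odd.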


Exercise 12.7.1. (a) Prove that if $n$ is represented by $ax^2 + bxy + cy^2$, then $an$ is represented
by the principal form of the same discriminant.
(b) Suppose that $d < 0$. Deduce that if $d$ is a square mod $4n$, then there is a multiple $an$ of $n$
which is represented by the principal form of discriminant $d$, with $1 \le a \le \sqrt{|d|/3}$.
(c) We obtained the bound $1 \le a \le \sqrt{|d|}$ when $d$ is even in section 9.6. Use that method to
find a bound in the case that $d$ is odd.


What about the product of the values of two different binary quadratic forms?
If $d$ is even, we have


(12.7.5) $(au^2 + buv + cv^2)\left(r^2 - \frac{d}{4}s^2\right) = ax^2 + bxy + cy^2$,


where $x = ur + \frac{b}{2}su + cvs$ and $y = vr - asu - \frac{b}{2}vs$.


If $d$ is odd, then
(12.7.6) $(au^2 + buv + cv^2)\left(r^2 + rs - \frac{d-1}{4}s^2\right) = ax^2 + bxy + cy^2$,


where $x = ur + \frac{b+1}{2}su + cvs$ and $y = vr - asu - \frac{b-1}{2}vs$.


That is, the product of an integer that can be represented by a binary quadratic
form $f$ and an integer that can be represented by the principal binary quadratic
form of the same discriminant can be represented by $f$.
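
As with the earlier identities, (12.7.5) and (12.7.6) can be confirmed on a small grid. This is a sketch with names of our own; the principal form is taken to be $r^2 - \frac{d}{4}s^2$ for even $d$ and $r^2 + rs - \frac{d-1}{4}s^2$ for odd $d$, as in the displayed formulas.

```python
from itertools import product

def times_principal_identity_holds(a, b, c, size=2):
    """Check (12.7.5) (b even) or (12.7.6) (b odd) on the grid [-size, size]^4."""
    d = b * b - 4 * a * c
    f = lambda x, y: a*x*x + b*x*y + c*y*y
    for u, v, r, s in product(range(-size, size + 1), repeat=4):
        if b % 2 == 0:
            principal = r*r - (d // 4) * s*s
            x = u*r + (b // 2) * s*u + c*v*s
            y = v*r - a*s*u - (b // 2) * v*s
        else:
            principal = r*r + r*s - ((d - 1) // 4) * s*s
            x = u*r + ((b + 1) // 2) * s*u + c*v*s
            y = v*r - a*s*u - ((b - 1) // 2) * v*s
        if f(u, v) * principal != f(x, y):
            return False
    return True
```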


Exercise 12.7.2. Suppose that $a$ is a prime and $d = b^2 - 4ac$ is even. Let $D = -d/4$.

(a) Show that if $a$ divides $r^2 + Ds^2$, then $a$ divides either $r + (b/2)s$ or $r - (b/2)s$.

(b) Prove that if $r^2 + Ds^2 = an$, then there exist integers $X, Y$ for which $n = aX^2 + bXY + cY^2$.
If n is prime, then this result is true whether or not a is prime, but we will not prove that here. 
Assume though that is so. 

(c) Suppose that (d/p) = 1 and that ap is the smallest multiple of p that is represented by the 

principal form. Prove that a here must take the same value as in exercise 

(d) Prove that $1 \le a \le \sqrt{|d|/3}$ and then use exercises 12.4.4 and 12.6.1(b) to prove that if


p< /|d|/2, then a= p. 


What about two different binary quadratic forms with no particular structure? 
For example, 


$(4u^2 + 3uv + 5v^2)(3r^2 + rs + 6s^2) = 2x^2 + xy + 9y^2$


by taking $x = ur - 3us - 2vr - 3vs$ and $y = ur + us + vr - vs$. These are
three inequivalent binary quadratic forms of discriminant $-71$. Gauss called this
composition, that is, finding, for given binary quadratic forms $f$ and $g$ of the same
discriminant, a third binary quadratic form $h$ of the same discriminant for which


$f(u,v)\,g(r,s) = h(x,y)$,
where $x$ and $y$ are quadratic polynomials in $u$, $v$, $r$, and $s$.
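
The discriminant $-71$ example can be checked mechanically. In the sketch below the three forms and the substitution for $x$ and $y$ are those of the example; the function names and the grid size are our own.

```python
from itertools import product

def f(u, v):
    return 4*u*u + 3*u*v + 5*v*v

def g(r, s):
    return 3*r*r + r*s + 6*s*s

def h(x, y):
    return 2*x*x + x*y + 9*y*y

def composition_example_holds(size=2):
    """Check f(u,v) g(r,s) = h(x,y) on the grid [-size, size]^4."""
    for u, v, r, s in product(range(-size, size + 1), repeat=4):
        x = u*r - 3*u*s - 2*v*r - 3*v*s
        y = u*r + u*s + v*r - v*s
        if f(u, v) * g(r, s) != h(x, y):
            return False
    return True
```

Since both sides have degree at most 2 in each of $u, v, r, s$, agreement on a grid with at least three values per variable is enough to confirm the identity.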


These constructions suggest many questions. For example, are the identities 
that we found for two given quadratic forms the only possibility? Could the prod- 
uct of two sums of two squares always equal the value of some entirely different 
quadratic form? When we are given two quadratic forms of the same discriminant, 
is it true that there is always some third quadratic form of the same discriminant 
such that the product of the values of the first two always equals a value of the 
third? That is, is there always a composition of two given binary quadratic forms of 
the same discriminant? If so, can we determine the third quadratic form quickly? 


Gauss proved that one can always find the composition of two binary quadratic 
forms of the same discriminant. The formulas above can mislead one into guessing 
that this is simply a question of finding the right generalization, but that is far 
from the truth. All of the examples, through to (12.7.6), are so explicit 
only because they are very special cases in the theory. In Gauss’s proof he had to 
prove that various other equations could be solved in integers in order to find h 
and the quadratic polynomials x and y (which are polynomials in u, v, r, and s). 
This was so complicated that some of the intermediate formulas took two pages to 
write down and are very difficult to make sense of.⁴ We will prove Gauss’s theorem
though we will approach it in a somewhat different way. 

Exercise 12.7.3. Given non-zero integers a, b,c,d prove that there exist integers m,n such that 


the set of integers that can be represented by (ar + bs)(cu + dv) as r,s,u,v run over the integers 
is the same as the set of integers that can be represented by mx+ny as x, y run over the integers. 


We finish this section by presenting a fairly general composition. 


Proposition 12.7.1. Suppose that $a_ix^2 + b_ixy + c_iy^2$ for $i = 1, 2$ are binary quadratic
forms of discriminant $d$ such that $q = (a_1, a_2)$ divides $\frac{b_1 + b_2}{2}$. Then


(12.7.7) $(a_1x_1^2 + b_1x_1y_1 + c_1y_1^2)(a_2x_2^2 + b_2x_2y_2 + c_2y_2^2) = a_3x_3^2 + b_3x_3y_3 + c_3y_3^2$


where $a_3 = a_1a_2/q^2$ and $b_3$ is any integer simultaneously satisfying the following
(solvable) set of congruences:


$b_3^2 \equiv d \pmod{4a_1a_2/q^2}$,
$b_3 \equiv b_1 \pmod{2a_1/q}$, $b_3 \equiv b_2 \pmod{2a_2/q}$,
$b_3(b_1 + b_2) \equiv b_1b_2 + d \pmod{4a_1a_2/q}$,


and $c_3$ is chosen so that the discriminant of $a_3x_3^2 + b_3x_3y_3 + c_3y_3^2$ is $d$.


Exercise 12.7.4. Show that the above congruences for b3 can be solved. 


Proposition 12.7.1 implies that we can always compose two binary quadratic
forms f and g of the same discriminant, whose leading coefficients are coprime. 


⁴ See article 234 and beyond in Gauss’s book Disquisitiones Arithmeticae (1801).
Proof sketch. Computer software verifies that (12.7.7) holds taking $a_3 = a_1a_2/q^2$,
for any integer $q$ dividing $(a_1, a_2)$, with

$$x_3 = q x_1x_2 + \frac{b_2 - b_3}{2a_2/q}\, x_1y_2 + \frac{b_1 - b_3}{2a_1/q}\, x_2y_1 + \frac{b_1b_2 + d - b_3(b_1 + b_2)}{4a_1a_2/q}\, y_1y_2$$

and

$$y_3 = \frac{a_1}{q}\, x_1y_2 + \frac{a_2}{q}\, x_2y_1 + \frac{b_1 + b_2}{2q}\, y_1y_2.$$

To ensure that we are always working with integers, the coefficients of $x_3$ and $y_3$
must be integers. So this formula works if we can find integers $q$ and $b_3$ for which $q$
divides $a_1$, $a_2$, and $\frac{b_1 + b_2}{2}$ and the above four congruences hold simultaneously for
integer $b_3$. It is difficult to determine whether there is such a $b_3$ for an arbitrary $q$,
but not so challenging if $q = (a_1, a_2)$ divides $\frac{b_1 + b_2}{2}$.
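
The proposition and its proof sketch can be turned into a small experiment. The code below is a sketch under stated assumptions: leading coefficients are positive, $q = (a_1, a_2)$ divides $\frac{b_1+b_2}{2}$, and $b_3$ is found by brute force over one period modulo $2a_3$ rather than by solving the congruences cleverly. The function names are ours.

```python
from math import gcd
from itertools import product

def compose(f1, f2):
    """Return (a3, b3, c3) satisfying the congruences of Proposition 12.7.1."""
    (a1, b1, c1), (a2, b2, c2) = f1, f2
    d = b1*b1 - 4*a1*c1
    assert d == b2*b2 - 4*a2*c2, "forms must share a discriminant"
    q = gcd(a1, a2)
    assert (b1 + b2) % (2*q) == 0, "hypothesis: q divides (b1 + b2)/2"
    a3 = a1 * a2 // (q * q)
    for b3 in range(2 * a3):  # the conditions only depend on b3 modulo 2*a3
        if ((b3*b3 - d) % (4*a3) == 0
                and (b3 - b1) % (2*a1 // q) == 0
                and (b3 - b2) % (2*a2 // q) == 0
                and (b3*(b1 + b2) - b1*b2 - d) % (4*a1*a2 // q) == 0):
            return a3, b3, (b3*b3 - d) // (4*a3)
    raise ValueError("no suitable b3 found")

def identity_holds(f1, f2, size=2):
    """Check (12.7.7) with the x3, y3 of the proof sketch on a small grid."""
    (a1, b1, c1), (a2, b2, c2) = f1, f2
    d = b1*b1 - 4*a1*c1
    q = gcd(a1, a2)
    a3, b3, c3 = compose(f1, f2)
    k12 = (b2 - b3) // (2*a2 // q)                       # coefficient of x1*y2
    k21 = (b1 - b3) // (2*a1 // q)                       # coefficient of x2*y1
    kyy = (b1*b2 + d - b3*(b1 + b2)) // (4*a1*a2 // q)   # coefficient of y1*y2
    for x1, y1, x2, y2 in product(range(-size, size + 1), repeat=4):
        x3 = q*x1*x2 + k12*x1*y2 + k21*x2*y1 + kyy*y1*y2
        y3 = (a1 // q)*x1*y2 + (a2 // q)*x2*y1 + ((b1 + b2) // (2*q))*y1*y2
        lhs = (a1*x1*x1 + b1*x1*y1 + c1*y1*y1) * (a2*x2*x2 + b2*x2*y2 + c2*y2*y2)
        if lhs != a3*x3*x3 + b3*x3*y3 + c3*y3*y3:
            return False
    return True
```

For the forms $[4,3,5]$ and $[3,1,6]$ of discriminant $-71$ this search gives $[12, 19, 9]$; the resulting form is not reduced, and reducing it is a separate step not attempted here.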


Corollary 12.7.1. For any given integers $a, b, c, h, k$ we have
$(ab, hk, ch) \cdot (ac, hk, bh) \sim (ah, hk, bc)$.


Proof. We multiply $(ab, hk, ch)$ and $(ac, hk, bh) \sim (bh, -hk, ac)$ using the proof of
Proposition 12.7.1. We take $q = b$ so that $a_3 = ah$ and $2q \mid b_1 + b_2 = 0$. Selecting
$b_3 = hk$ we find that the congruences of Proposition 12.7.1 reduce to $d \equiv (hk)^2
\pmod{4abh}$, which follows from $d = (hk)^2 - 4abch$. Hence we have that $(ab, hk, ch) \cdot
(ac, hk, bh) \sim (ab, hk, ch) \cdot (bh, -hk, ac) \sim (ah, hk, bc)$.

To get more symmetry in the statement of the result we note that $(ah, hk, bc) \cdot
(bc, hk, ah) = 1$, and so


$(ab, hk, ch) \cdot (ac, hk, bh) \cdot (bc, hk, ah) \sim 1$.


12.8. Dirichlet composition 


Dirichlet claimed that when he was a student, working with Gauss, he slept with 
a copy of Disquisitiones under his pillow every night for three years. It worked, 
as Dirichlet found a way to better understand Gauss’s proof of composition, which 
amounts to a straightforward algorithm to determine the composition of two given 
binary quadratic forms f and g of the same discriminant. 


Exercise 12.8.1. Given any primitive binary quadratic form $f(x, y) \in \mathbb{Z}[x, y]$ and non-zero integer
$A$, prove that there exist integers $r$ and $s$ such that $f(r,s)$ is coprime to $A$. Deduce that there
exists a binary quadratic form $g$, for which $f \sim g$, with $(g(1,0), A) = 1$.


Exercise 12.8.2. Suppose that $f(x,y)$, $F(X,Y)$ are two binary quadratic forms, with $\mathrm{disc}(f) \equiv
\mathrm{disc}(F) \pmod 2$, for which $f(1,0) = a$ is coprime to $F(1,0) = A$. Prove that there exist quadratic
forms $g = ax^2 + bxy + cy^2$ and $G = AX^2 + bXY + CY^2$ with the same middle coefficient, such
that $f \sim g$ and $F \sim G$.


Now suppose we begin with two quadratic forms of the same discriminant.
Let $A$ be the leading coefficient of one of them. Then the other is equivalent to
a quadratic form with leading coefficient $a$, for some integer $a$ coprime to $A$, by
exercise 12.8.1. Then these are equivalent to quadratic forms $g = ax^2 + bxy + cy^2$
and $G = AX^2 + bXY + CY^2$, respectively, by exercise 12.8.2. Since these have the
same discriminant we deduce that $ac = AC$ and so, as $(a, A) = 1$, there exists an integer $h$ for
which


$g(x,y) = ax^2 + bxy + Ahy^2$ and $G(x,y) = Ax^2 + bxy + ahy^2$.
Then
$H(m,n) = g(u,v)\,G(r,s)$ with $H(x,y) = aAx^2 + bxy + hy^2$,
where $m = ur - hvs$ and $n = aus + Avr + bvs$.
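
The identity $H(m,n) = g(u,v)G(r,s)$ is purely algebraic in $a, A, b, h$, so it can be checked on a grid without worrying about coprimality (which is only needed to bring the two forms into this shape). A minimal sketch with names of our own:

```python
from itertools import product

def dirichlet_identity_holds(a, A, b, h, size=2):
    """Check H(m, n) = g(u, v) * G(r, s) on the grid [-size, size]^4."""
    g = lambda x, y: a*x*x + b*x*y + A*h*y*y
    G = lambda x, y: A*x*x + b*x*y + a*h*y*y
    H = lambda x, y: a*A*x*x + b*x*y + h*y*y
    return all(
        H(u*r - h*v*s, a*u*s + A*v*r + b*v*s) == g(u, v) * G(r, s)
        for u, v, r, s in product(range(-size, size + 1), repeat=4)
    )
```

For example, with $a = 2$, $A = 3$, $b = 1$, $h = 1$ the forms are $2x^2 + xy + 3y^2$, $3x^2 + xy + 2y^2$ and $6x^2 + xy + y^2$, all of discriminant $-23$.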


Dirichlet went on to interpret this in terms of what we would today call ideals; and 
this in turn led to the birth of modern algebra by Dedekind. In this theory one is 
typically not so much interested in the identity, writing H as a product of g and G 
(which is typically very complicated and none too enlightening), but rather in how 
to determine H from g and G. Dirichlet’s proof goes as follows: 


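The ideal $I\left(a, \frac{-b + \sqrt{d}}{2}\right)$ is associated to a given binary quadratic form $ax^2 +
bxy + cy^2$ (see section 12.10 of appendix 12B). Therefore when we multiply together
$g$ and $G$, we multiply together their associated ideals to obtain

$$J := I\left(a, \frac{-b + \sqrt{d}}{2}\right) I\left(A, \frac{-b + \sqrt{d}}{2}\right),$$

which contains $aA$ as well as both $a \cdot \frac{-b + \sqrt{d}}{2}$ and $A \cdot \frac{-b + \sqrt{d}}{2}$. Since $(a, A) = 1$ there
exist integers $r, s$ for which $ar + As = 1$ and so our new ideal contains

$$r \cdot a \cdot \frac{-b + \sqrt{d}}{2} + s \cdot A \cdot \frac{-b + \sqrt{d}}{2} = \frac{-b + \sqrt{d}}{2}.$$

Therefore
$$J = I\left(aA, \frac{-b + \sqrt{d}}{2}\right),$$

which is the ideal associated with the binary quadratic form $H$.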


Defining the class group. We now know that we can multiply together the 
values of any two quadratic forms of the same discriminant and get another. Since 
there are only finitely many equivalence classes of binary quadratic forms of a given 
discriminant this might seem to lead to a group structure, under multiplication. 
To prove this we will need to know that the usual group properties hold (most
importantly, associativity), and also that the values of a binary quadratic form
classify the form. Unfortunately this is not quite true. In an earlier exercise we
saw that the only issue in distinguishing between the values taken by forms is
perhaps the values taken by $ax^2 + bxy + cy^2$ and $au^2 - buv + cv^2$. However there
is an automorphism $u = x$, $v = -y$ between their sets of values so they cannot be
distinguished in this way. On the other hand, the ideals

$$I\left(a, \frac{-b + \sqrt{d}}{2}\right) \text{ and } I\left(a, \frac{b + \sqrt{d}}{2}\right)$$

are quite distinct, and so multiplying ideals (and therefore forms) using Dirichlet’s
technique leads one immediately to being able to determine a group structure. This
is called the class group, since the group acts on equivalence classes of ideals (and
so of forms). In this approach, associativity follows easily, as the
numbers in the ideals multiply associatively, and it is similarly evident that the
class group is commutative. Therefore the class group is a commutative group, 
acting on the ideal classes of a given discriminant, with identity element given by 
the class of principal ideals (which correspond to the principal form). 

We will now give a useful criterion to determine how to take square roots inside 
the class group. 
Proposition 12.8.1. If $f$ is a binary quadratic form of fundamental discriminant
$d$ which represents the square of an odd integer, then there exists a binary quadratic
form $g$ of discriminant $d$ for which $g \cdot g \sim f$.


Proof. We begin by squaring the primitive form $ax^2 + bxy + acy^2$. Then

$$J := I\left(a, \frac{-b + \sqrt{d}}{2}\right)^2$$

contains $a^2$, $a \cdot \frac{-b + \sqrt{d}}{2}$, and $\left(\frac{-b + \sqrt{d}}{2}\right)^2 = -a^2c - b \cdot \frac{-b + \sqrt{d}}{2}$. Therefore $J$ contains
$a \cdot \frac{-b + \sqrt{d}}{2}$ and $b \cdot \frac{-b + \sqrt{d}}{2}$. Now $(a, b) = 1$ or else our original form was not primitive,
and so $J$ contains $\frac{-b + \sqrt{d}}{2}$. Therefore

$$J = I\left(a^2, \frac{-b + \sqrt{d}}{2}\right)$$

and the corresponding binary quadratic form is $a^2x^2 + bxy + cy^2$.
One can justify this by finding a suitable multiplication of forms, namely,
$(ar^2 + brs + acs^2)(au^2 + buv + acv^2) = a^2x^2 + bxy + cy^2$,
where $x = ru - csv$ and $y = asu + arv + bsv$.


Now if $f$ represents $a^2$ with $(a, d) = 1$, then there exist integers $b, c$ such
that the quadratic form $F := a^2x^2 + bxy + cy^2$ is equivalent to $f$. Note that
$(a, b)^2$ divides $d = b^2 - 4a^2c$, which is a fundamental discriminant and so squarefree
except perhaps a power of 2. However $a$ is odd and so $(a, b) = 1$. Therefore we let
$g = ax^2 + bxy + acy^2$ so that, as in the previous paragraph, $g \cdot g \sim F \sim f$.
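
The explicit multiplication of forms used in the proof can be checked on a grid. A minimal sketch, with a function name of our own:

```python
from itertools import product

def squaring_identity_holds(a, b, c, size=2):
    """Check g(r,s) g(u,v) = F(x,y) for g = [a, b, ac] and F = [a^2, b, c]."""
    g = lambda x, y: a*x*x + b*x*y + a*c*y*y
    F = lambda x, y: a*a*x*x + b*x*y + c*y*y
    return all(
        g(r, s) * g(u, v) == F(r*u - c*s*v, a*s*u + a*r*v + b*s*v)
        for r, s, u, v in product(range(-size, size + 1), repeat=4)
    )
```

This only tests the polynomial identity; the hypotheses of the proposition (fundamental discriminant, $a$ odd) play no role in it.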


12.9. Bhargava composition⁵
Let us begin with one further explicit composition, a tiny variant on (12.7.3) (letting
$s \to -s$ there):
$(au^2 + 2Buv + cv^2)(ar^2 - 2Brs + cs^2) = x^2 + (ac - B^2)y^2$
where $x = aur + B(vr - us) - cvs$ and $y = us + vr$.


Combining this with the results of the previous section suggests that if the discrim- 
inant d is divisible by 4 (which is equivalent to b being even), then 


(12.9.1) $F(u,v)\,G(r,s)\,H(m,-n) = P(x,y)$


5 Although there is no Nobel Prize in mathematics, there is the Fields Medal, awarded every four 
years, only to people 40 years of age or younger. In 2014, in Korea, one of the laureates was Manjul 
Bhargava for a body of work that begins with his version of composition, as discussed here, and allows 
us to much better understand many classes of equations, especially cubic. 
where $P(x, y) = x^2 - \frac{d}{4}y^2$ is the principal form and $x$ and $y$ are cubic polynomials
in $m, n, r, s, u, v$. Analogous remarks can be made if the discriminant is odd.

In 2004 Bhargava came up with an entirely new way to find all of the triples
$F, G, H$ of binary quadratic forms of the same discriminant for which (12.9.1) holds:
We begin with a 2-by-2-by-2 cube, the corners of which are labeled with the integers
$a, b, c, d, e, f, g, h$.


Figure 12.2. Bhargava’s Rubik-type cube.


There are six faces of a cube, and these can be split into three parallel pairs. To
each such parallel pair consider the pair of 2-by-2 matrices given by taking the
entries in each face, those entries corresponding to opposite corners of the cube,
always starting with $a$. Hence we get the pairs

$$M_1(x, y) := \begin{pmatrix} a & b \\ c & d \end{pmatrix} x + \begin{pmatrix} e & f \\ g & h \end{pmatrix} y = \begin{pmatrix} ax + ey & bx + fy \\ cx + gy & dx + hy \end{pmatrix},$$
$$M_2(x, y) := \begin{pmatrix} a & c \\ e & g \end{pmatrix} x + \begin{pmatrix} b & d \\ f & h \end{pmatrix} y = \begin{pmatrix} ax + by & cx + dy \\ ex + fy & gx + hy \end{pmatrix},$$
$$M_3(x, y) := \begin{pmatrix} a & b \\ e & f \end{pmatrix} x + \begin{pmatrix} c & d \\ g & h \end{pmatrix} y = \begin{pmatrix} ax + cy & bx + dy \\ ex + gy & fx + hy \end{pmatrix},$$

where we have, in each, appended the variables, $x, y$, to create matrix functions of
$x$ and $y$. The determinant, $-Q_i(x, y)$, of each $M_i(x, y)$ is a quadratic form in $x$ and
$y$. Incredibly $Q_1$, $Q_2$, and $Q_3$ all have the same discriminant and their composition
equals $P$, the principal form, just as in (12.9.1). We present two proofs. First, by
substitution, one can exhibit that 


$Q_1(x, -y) = Q_2(x_2, y_2)\,Q_3(x_3, y_3)$
where 


v= (eum) (Sa) (G2) ma z= (5 i) (3) 


Let’s work through an example: Plot the cube in three dimensions, take the
Cartesian coordinates of every corner (each 0 or 1), and then label the corner
$(x, y, z)$ with $2^2x + 2y + z$, squared. Hence
$a, b, c, d, e, f, g, h = 2^2, 6^2, 0^2, 4^2, 3^2, 7^2, 1^2, 5^2$,
yielding the cube in Figure 12.3.


Figure 12.3. The construction of three binary quadratic forms using Bhargava’s cube.


This cube leads to three binary quadratic forms of discriminant $-7 \cdot 4^4$:
$Q_1 = -4^2(4x^2 + 13xy + 11y^2)$, $Q_2 = -2^2(x^2 - 2xy + 29y^2)$, and $Q_3 = 4^2(8x^2 + 5xy + y^2)$.
After some work one can verify that
$Q_1(m,n)\,Q_2(r,s)\,Q_3(u,v) = 4(x^2 + 4^3 \cdot 7y^2)$,
where $x$ and $y$ are the following cubic polynomials in $m, n, r, s, u, v$:


$x = 8(-11mru - 3mrv + 25msu + 17msv - 17nru - 4nrv + 59nsu + 32nsv)$


and $y = mru + mrv + 21msu + 5msv + 3nru + 2nrv + 31nsu + 6nsv$.
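
The worked example can be reproduced by computing $-\det M_i(x,y)$ for the three pairs of faces and then testing the product identity on a grid. The sketch below uses helper names of our own; the corner labels, taken in the order $a, \ldots, h$, are $2^2, 6^2, 0^2, 4^2, 3^2, 7^2, 1^2, 5^2$.

```python
from itertools import product

def neg_det_form(P, Q):
    """Coefficients (A, B, C) of -det(P x + Q y) = A x^2 + B xy + C y^2."""
    (p, q), (r, s) = P
    (t, u), (v, w) = Q
    return (-(p*s - q*r), -(p*w + t*s - q*v - u*r), -(t*w - u*v))

def cube_forms(cube):
    """The three forms Q1, Q2, Q3 attached to the cube (a, b, c, d, e, f, g, h)."""
    a, b, c, d, e, f, g, h = cube
    return (neg_det_form(((a, b), (c, d)), ((e, f), (g, h))),
            neg_det_form(((a, c), (e, g)), ((b, d), (f, h))),
            neg_det_form(((a, b), (e, f)), ((c, d), (g, h))))

def disc(form):
    A, B, C = form
    return B*B - 4*A*C

def value(form, x, y):
    A, B, C = form
    return A*x*x + B*x*y + C*y*y

CUBE = (4, 36, 0, 16, 9, 49, 1, 25)

def example_identity_holds():
    """Check Q1(m,n) Q2(r,s) Q3(u,v) = 4 (x^2 + 448 y^2) on {-1, 0, 1}^6."""
    Q1, Q2, Q3 = cube_forms(CUBE)
    for m, n, r, s, u, v in product((-1, 0, 1), repeat=6):
        x = 8*(-11*m*r*u - 3*m*r*v + 25*m*s*u + 17*m*s*v
               - 17*n*r*u - 4*n*r*v + 59*n*s*u + 32*n*s*v)
        y = (m*r*u + m*r*v + 21*m*s*u + 5*m*s*v
             + 3*n*r*u + 2*n*r*v + 31*n*s*u + 6*n*s*v)
        if value(Q1, m, n) * value(Q2, r, s) * value(Q3, u, v) != 4*(x*x + 448*y*y):
            return False
    return True
```

Both sides have degree at most 2 in each of the six variables, so three values per variable suffice to confirm the identity.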


Bhargava proves his theorem, inspired by a 2-by-2-by-2 Rubik’s cube. His idea 
is to apply one invertible linear transformation at a time, simultaneously to a pair 
of opposite sides, and to slowly “reduce” the numbers involved, while retaining the 
equivalence classes of $Q_1$, $Q_2$, and $Q_3$, until one reduces to a cube and a triple of
binary quadratic forms with coefficients having a convenient structure. 


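Lemma 12.9.1. If one applies an invertible linear transformation to a pair of
opposite sides, then the associated binary quadratic form is transformed in the usual
way, whereas the other two quadratic forms remain the same.


Therefore we can act on our cube by such $SL(2, \mathbb{Z})$-transformations, in each
direction, and the three binary quadratic forms each remain in the same equivalence
class.


Proof. If $\begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix} \in SL(2, \mathbb{Z})$, then we replace the face

$$\begin{pmatrix} a & b \\ c & d \end{pmatrix} \text{ by } \begin{pmatrix} a & b \\ c & d \end{pmatrix}\alpha + \begin{pmatrix} e & f \\ g & h \end{pmatrix}\beta \quad\text{and}\quad \begin{pmatrix} e & f \\ g & h \end{pmatrix} \text{ by } \begin{pmatrix} a & b \\ c & d \end{pmatrix}\gamma + \begin{pmatrix} e & f \\ g & h \end{pmatrix}\delta.$$

Then $M_1(x, y)$ gets mapped to

$$\left\{\begin{pmatrix} a & b \\ c & d \end{pmatrix}\alpha + \begin{pmatrix} e & f \\ g & h \end{pmatrix}\beta\right\} x + \left\{\begin{pmatrix} a & b \\ c & d \end{pmatrix}\gamma + \begin{pmatrix} e & f \\ g & h \end{pmatrix}\delta\right\} y,$$

that is, $M_1(\alpha x + \gamma y, \beta x + \delta y)$. Therefore the quadratic form $Q_1(x, y)$ gets mapped
to $Q_1(\alpha x + \gamma y, \beta x + \delta y)$ which is equivalent to $Q_1(x, y)$. Now $M_2(x, y)$ gets mapped
to

$$\begin{pmatrix} a\alpha + e\beta & c\alpha + g\beta \\ a\gamma + e\delta & c\gamma + g\delta \end{pmatrix} x + \begin{pmatrix} b\alpha + f\beta & d\alpha + h\beta \\ b\gamma + f\delta & d\gamma + h\delta \end{pmatrix} y = \begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix} M_2(x, y);$$

hence the determinant, $-Q_2(x, y)$, is unchanged. An analogous calculation reveals
that $M_3(x, y)$ gets mapped to $\begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix} M_3(x, y)$ and the determinant, $-Q_3(x, y)$,
is also unchanged.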
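
The lemma can be illustrated numerically: transform one pair of opposite faces and compare the three forms before and after. The sketch below is an assumption-laden illustration; the helper names are ours, the faces $(a,b,c,d)$ and $(e,f,g,h)$ are the pair acted on, and the test cube is the one from the worked example.

```python
def neg_det_form(P, Q):
    """Coefficients (A, B, C) of -det(P x + Q y) = A x^2 + B xy + C y^2."""
    (p, q), (r, s) = P
    (t, u), (v, w) = Q
    return (-(p*s - q*r), -(p*w + t*s - q*v - u*r), -(t*w - u*v))

def cube_forms(cube):
    a, b, c, d, e, f, g, h = cube
    return (neg_det_form(((a, b), (c, d)), ((e, f), (g, h))),
            neg_det_form(((a, c), (e, g)), ((b, d), (f, h))),
            neg_det_form(((a, b), (e, f)), ((c, d), (g, h))))

def transform_faces(cube, alpha, beta, gamma, delta):
    """Replace face1 by alpha*face1 + beta*face2 and face2 by gamma*face1 + delta*face2."""
    face1, face2 = cube[:4], cube[4:]
    new1 = tuple(alpha*t + beta*u for t, u in zip(face1, face2))
    new2 = tuple(gamma*t + delta*u for t, u in zip(face1, face2))
    return new1 + new2

def value(form, x, y):
    A, B, C = form
    return A*x*x + B*x*y + C*y*y

def lemma_holds(cube, alpha, beta, gamma, delta, size=2):
    """Q2, Q3 unchanged; Q1 becomes Q1(alpha x + gamma y, beta x + delta y)."""
    assert alpha*delta - beta*gamma == 1, "matrix must lie in SL(2, Z)"
    Q1, Q2, Q3 = cube_forms(cube)
    R1, R2, R3 = cube_forms(transform_faces(cube, alpha, beta, gamma, delta))
    q1_ok = all(
        value(R1, x, y) == value(Q1, alpha*x + gamma*y, beta*x + delta*y)
        for x in range(-size, size + 1) for y in range(-size, size + 1)
    )
    return q1_ok and R2 == Q2 and R3 == Q3
```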


The previous lemma allows one to proceed in “reducing” the three binary qua- 
dratic forms to equivalent forms that are easy to work with (rather as in Dirichlet’s 
proof). 


Proof of the Bhargava composition. We will simplify the entries in the cube 
by the following reduction algorithm: 

• We select the corner that is to be $a$ so that $a \ne 0$.

• We will transform the cube to ensure that $a$ divides $b$, $c$, and $e$. If not, say
$a$ does not divide $e$, then select integers $\alpha, \beta$ so that $a\alpha + e\beta = (a, e)$, and then let
$\gamma = -e/(a, e)$, $\delta = a/(a, e)$. In the transformed matrix we have $a' = (a, e)$, $e' = 0$,
and $1 \le a' \le a - 1$. It may well now be that $a'$ does not divide $b'$ or $c'$, so we repeat
the process. Each time we do this we reduce the value of $a$ by at least 1; and since
it remains positive this can only happen a finite number of times. At the end of
the process $a$ divides $b$, $c$, and $e$.

• We will transform the cube to ensure that $b = c = e = 0$. We already have
that $a \mid b, c, e$. Now select $\alpha = 1$, $\beta = 0$, $\gamma = -e/a$, $\delta = 1$, so that $e' = 0$, $b' = b$, $c' =
c$. We repeat this in each of the three directions to ensure that $b = c = e = 0$.

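Replacing $a$ by $-a$, we have that the three matrices are

$$M_1(x, y) := \begin{pmatrix} -a & 0 \\ 0 & d \end{pmatrix} x + \begin{pmatrix} 0 & f \\ g & h \end{pmatrix} y, \text{ so that } Q_1(x, y) = adx^2 + ahxy + fgy^2,$$
$$M_2(x, y) := \begin{pmatrix} -a & 0 \\ 0 & g \end{pmatrix} x + \begin{pmatrix} 0 & d \\ f & h \end{pmatrix} y, \text{ so that } Q_2(x, y) = agx^2 + ahxy + dfy^2,$$
$$M_3(x, y) := \begin{pmatrix} -a & 0 \\ 0 & f \end{pmatrix} x + \begin{pmatrix} 0 & d \\ g & h \end{pmatrix} y, \text{ so that } Q_3(x, y) = afx^2 + ahxy + dgy^2.$$

All three $Q_i$ have discriminant $(ah)^2 - 4adfg$, and we observe that

$$Q_1(x_1, y_1) = Q_2(x_2, y_2)\,Q_3(x_3, y_3)$$

where $x_1 = fy_2x_3 + gx_2y_3 + hy_2y_3$ and $y_1 = ax_2x_3 - dy_2y_3$.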
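
The final identity for the reduced cube is again a polynomial identity, here in the remaining labels $a, d, f, g, h$, and can be checked on a grid. A minimal sketch with a function name of our own:

```python
from itertools import product

def reduced_cube_identity_holds(a, d, f, g, h, size=2):
    """Check Q1(x1, y1) = Q2(x2, y2) Q3(x3, y3) for the reduced cube."""
    Q1 = lambda x, y: a*d*x*x + a*h*x*y + f*g*y*y
    Q2 = lambda x, y: a*g*x*x + a*h*x*y + d*f*y*y
    Q3 = lambda x, y: a*f*x*x + a*h*x*y + d*g*y*y
    for x2, y2, x3, y3 in product(range(-size, size + 1), repeat=4):
        x1 = f*y2*x3 + g*x2*y3 + h*y2*y3
        y1 = a*x2*x3 - d*y2*y3
        if Q1(x1, y1) != Q2(x2, y2) * Q3(x3, y3):
            return False
    return True
```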


This brings to mind the twists of the Rubik’s cube, though in that case one has 
only finitely many possible transformations, whereas here there are infinitely many 
possibilities, as there are infinitely many invertible linear transformations over Z. 
Appendices. The extended version of chapter 12 has the following additional
appendices: 


Appendix 12B. The class group is a group whose elements are the equivalence 
classes of quadratic forms with multiplication defined by composition as in appendix 
12A. We will focus on classifying the all-important elements of order two. 


Appendix 12C. Binary quadratic forms of positive discriminant. We have al- 
ready explored at length the theory of binary quadratic forms of negative discrim- 
inant. Positive quadratic forms are quite a bit trickier, largely because there are 
infinitely many automorphisms of the solutions of a quadratic equation of this 
discriminant, corresponding to the solutions to Pell’s equation, whereas for negative
discriminants there is usually just the one non-trivial automorphism $(x, y) \to
(-x, -y)$. Here we present some of that theory.


Appendix 12D. Sums of three squares. We discover here the connection between 
sums of three squares and class numbers and then develop Dirichlet’s class number 
formula. 


Appendix 12E. Sums of four squares. We give two proofs that every positive 
integer is the sum of four squares, including one via the theory of quaternions, and 
then discuss how many representations each integer has as the sum of four squares. 


Appendix 12F. Universality. A quadratic form is universal if it takes all positive 
integer values. Although these were classified long ago by Ramanujan it was only 
recently that researchers found a much neater classification: simply verifying that 
the quadratic form represents every integer up to 290. 


Appendix 12G. Integers represented in Apollonian circle packings. In appendix 
9C we developed some of the mathematics of the curvatures inside a circle tiled by 
smaller circles. Now we show how some subset of the integers represented can be 
found by reducing the question to values of binary quadratic forms. 


Hints for exercises 


EXERCISES IN CHAPTER 0 
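Exercise 0.1.1(b). The key observation is that if $\alpha = \frac{1 + \sqrt{5}}{2}$ or $\frac{1 - \sqrt{5}}{2}$, then $\alpha^2 = \alpha + 1$ and
so, multiplying through by $\alpha^{n-2}$, we have $\alpha^n = \alpha^{n-1} + \alpha^{n-2}$ for all $n \ge 2$.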
Exercise b). Multiplying through by $\phi$ we have $\phi^{n+1} = F_n\phi^2 + F_{n-1}\phi$. Now use (a).
Exercise b). Determine a and b in terms of a and then c and d in terms of a, x0, 
and 71. 
Exercise 0.2.1(a). Note that $N^2 + (2N + 1) = (N + 1)^2$.
Exercise [0.3.1] In both ea use induction on n. 
Exercise 0.4.2. Use (0.1.1) to establish that $|F_n - \phi^n/\sqrt{5}| < \frac{1}{2}$ for all $n \ge 0$.
See If the first character in a string in Ap is a ne what must the subsequent 
string look like? What if the string begins with a 1? 
Exercise [0.4.8] Use Gauss’s trick to show that Vacn<o” = ee iS (= ee 
a product of two integers of opposite parity, both > 1. Show that if $N$ is not a power of
2 (so that it has an odd divisor $m > 1$), then it is a product of two integers of opposite
parity, both > 1. Determine $a$ and $b$ in terms of $N$ and $m$.
Exercise a). Verify this for k = 1 and 2, and then for larger k by induction. 

(b) a k and m as functions of n. 


ExerciseWZIG By (LD, V5F. = 4" - 3", and so (V5F.)* = Dh, (*)(-1)%p? where 
py := 6 be-F. Let 2*tt — 4 ca’ =J]*_, (a — p;). Therefore 


j=0O 
2 ci(V5Fnti)” = yy (‘) (—1)’ 93 - S cip} = » (‘) (—1)’p3 «oj? = (V5 Fate)” 
i=0 j=0 i=0 j=0 


The result follows after dividing through by (/5)*. 
EXERCISES IN CHAPTER 1
Exercise 1.1.1(a). Write $a = db$ for some integer $d$. Show that if $d \ne 0$, then $|d| \ge 1$.


(b) Prove that if $u$ and $v$ are integers for which $uv = 1$, then either $u = v = 1$ or
$u = v = -1$. (c) Write $b = ma$ and $c = na$ and show that $bx + cy = max + nay$ is divisible
by $a$.


Exercise Use Lemma and induction on a for fixed b. 


Exercise 1.2.1(a). By exercise 1.1.1(c) we know that $d$ divides $au + bv$ for any integers $u$
and $v$. Now use Theorem (d) First note that $a$ divides $b$ if and only if $-a$ divides $b$.
If $|a| = \gcd(a, b)$, then $|a|$ divides both $a$ and $b$, and so $a$ divides $b$. On the other hand if
$a$ divides $b$, then $|a| \le \gcd(a, b) \le |a|$ by (c).

Exercise 1.2.4(b). Let $g = \gcd(a, b)$ and write $a = gA$, $b = gB$ for some integers $A$ and $B$.
What is the value of $Au + Bv$? Now apply (a).


Exercise [1.2.5{a). Use Theorem 
Exercise [1.4.2] Use Lemma[I.4.1] 


Exercise 1.7.5(e). Write $r = m + \delta$ where $0 \le \delta < 1$, so that $[r] = m$ and $a - r = a - m - \delta$
so that $[a - r] = ?$.
Exercise [1.7.10} Given any solution, determine uw using Lemma|l.1.1 


Exercise [1.7.11] One might apply Corollary [2.2] 
Exercise [I.7.14(d). Use exercise [7.10] 


Exercise [7.22] For each given m > 1, prove that am|%m, for all r > 1, by induction on 
r, using exercise [0.4.10[a) with k = rm. 
Exercise [L7.23(a). Prove that gcd(an,b) = gcd(aavn_1,b) for all n > 2, and then use 


induction on n > 1, together with Corollary [22.2] (b) Prove that gcd(an,an-1) = 
gcd(ban—2,¢n—1) for all n > 2, and then use induction on n > 1, together with Corollary 
1.2.2] (c) Use exercise[0.4.10(a) with k = n—m and then (b). (d) Follow the steps of the 
Euclidean algorithm using (c). 


Exercise Use the matrix transformation for (uj,uj+1) 4 (uj41, Uj+2)- 


EXERCISES IN CHAPTER 2 


Exercise 2.1.4(b). Write the integers in the congruence class $a \pmod d$ as $a + nd$ as $n$
varies over the integers, and partition the integers $n$ into the congruence classes mod $k$.


Exercise [2.1.5] Write the congruence in terms of integers and then use exercise [L.1.I[c). 
Exercise [2.1.6] Write the congruence in terms of integers and then use exercise e). 
Exercise c). Factor 1001. 

Exercise 2.5.4(a). Split the integers into k blocks of m consecutive integers, and use the
main idea from the first proof of Theorem 2.1. (b) Write N = km + r with 0 ≤ r ≤ m − 1.
Use (a) to get k such integers in the first km consecutive integers, and at most one in the
remaining r. Compare k or k + 1 to the result required.


Exercise b). Use the results for m = 4 from (a). (d) Use the same idea as in (c). 
(e) Study squares mod 8. 


Exercise 2.5.9{b). Use that + (—1) = 4(°). 


p\j 
Exercise [2.5.10[a). Treat the cases a > b and a < b separately. (b) Treat the cases c > d 
and c < d separately. 

Exercise [2.5.13] Proceed by induction on k > 1. 


Exercise [2.5.15|b). Use induction. 
Exercise 2.5.16(a). Try a proof by contradiction. Start by assuming that the kth pigeonhole
contains a_k letters for each k, and determine a bound on the total number of letters
if each a_k ≤ 1. (b) Use the pigeonhole principle. (c) Use induction.


Exercise 2.5.17(a). Use the pigeonhole principle on pairs (x_j (mod d), x_{j+1} (mod d)).
(d) Use exercise 1.7.24.


EXERCISES IN CHAPTER 3 


Exercise [3.0.1] The only divisors of p are 1 and p. Therefore gcd(p,a) = 1 or p, and so 
gcd(p,a) = p if and only if p divides a. This implies that gcd(p,a) = 1 if and only if p 
does not divide a. 
Exercise [3.1.1} Use induction and the fact that every integer > 1 has a prime divisor, 
as proved in the “prerequisites” section. (The proof will appear as part of the proof of 
Theorem [3.2] ) 
Exercise a). Apply Theorem 3.1 with a = a_1 ··· a_{k−1} and b = a_k, and if p divides a,
then proceed by induction.

(b) p divides some q_j by (a), and as q_j only has divisors 1 and q_j, and as p > 1, we
deduce that p = q_j.
Exercise b). Write n = 2^k m with m odd. Then n has an odd prime factor if and
only if m > 1. Therefore if n has no odd prime factor, then n = 2^k.
Exercise We have [a,b] = ab by Corollary The result follows from Lemma 
14.1 
Exercise 3.3.1. Look at this first in the case that m and n are both powers of p, say,
m = p^a and n = p^b. If d divides m and n, then d = p^c, say, with c ≤ a and c ≤ b.
The maximum c that satisfies both of these inequalities is min{a, b}. Similarly if m and
n divide L = p^e, then a ≤ e and b ≤ e and so the minimum e that satisfies both of these
inequalities is max{a, b}. Now use this idea when m and n are arbitrary integers.
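The prime-by-prime idea extends directly to arbitrary integers. Here is a small Python sketch, offered as an illustration rather than as the book's method: the function names factorize and gcd_lcm_from_exponents and the sample pair (360, 84) are assumptions made for this example.

```python
from collections import Counter

def factorize(n):
    """Prime factorization of n >= 1 as a Counter {prime: exponent} (trial division)."""
    factors = Counter()
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors[p] += 1
            n //= p
        p += 1
    if n > 1:
        factors[n] += 1
    return factors

def gcd_lcm_from_exponents(m, n):
    """Build gcd and lcm prime by prime, using min and max of the exponents."""
    fm, fn = factorize(m), factorize(n)
    g = l = 1
    for p in set(fm) | set(fn):
        a, b = fm[p], fn[p]        # a Counter returns 0 for a missing prime
        g *= p ** min(a, b)        # largest power of p dividing both m and n
        l *= p ** max(a, b)        # smallest power of p divisible by both p^a and p^b
    return g, l

g, l = gcd_lcm_from_exponents(360, 84)   # 360 = 2^3 * 3^2 * 5, 84 = 2^2 * 3 * 7
```

Since min{a, b} + max{a, b} = a + b for every prime, the same computation also shows that gcd(m, n) times lcm[m, n] equals mn.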


Exercise Use exercise d). 
Exercise [3.3.7(c). Use exercise [3.3.3{c). 
Exercise [3.5.1(a). Show that the aj + b are distinct mod m. 


Exercise [3.5.2] Prove that the r; (mod m) are all reduced residues, and then that they 
are distinct. 

Exercise If ar ≡ c (mod b), then b divides ar − c. Therefore gcd(a, b) divides ar − c
and so c. In the other direction, we write g = gcd(a, b) and so a = gA, b = gB, c = gC,
and we are looking for solutions to Ar ≡ C (mod B). Then use exercise b).
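The two directions of this hint can be turned into a short procedure. The sketch below is an illustration under stated assumptions, not the book's algorithm: the function name solve_linear_congruence is hypothetical, and it relies on Python 3.8+ where pow(A, -1, B) returns a modular inverse.

```python
from math import gcd

def solve_linear_congruence(a, c, b):
    """All r in [0, b) with a*r ≡ c (mod b); the list is empty unless gcd(a, b) divides c."""
    g = gcd(a, b)
    if c % g != 0:
        return []                      # gcd(a, b) must divide c
    A, B, C = a // g, b // g, c // g   # a = gA, b = gB, c = gC
    r0 = (C * pow(A, -1, B)) % B       # A is invertible mod B because gcd(A, B) = 1
    return [r0 + k * B for k in range(g)]

solutions = solve_linear_congruence(6, 4, 10)
```

The reduced congruence Ar ≡ C (mod B) has one solution mod B, which lifts to g solutions mod b.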
Exercise Use the second proof of Corollary 


Exercise If am + bn = c, then am + bn ≡ c (mod b) (or indeed mod any integer
r ≥ 1). On the other hand if au + bv ≡ c (mod b) and m is any integer ≡ u (mod b), then
am ≡ au + bv ≡ c (mod b) and so there exists an integer n for which am + bn = c.


Exercise 3.7.2(a). We proceed by induction on the number of moduli using exercise 3.2.1.
(b) Replace m in (a) by m − n.


Exercise[3.7.8{a). Work with the prime power divisors of m and use the Chinese Remainder 
Theorem. 


Exercise Calculate the product mod p^e, for every prime power p^e || m.
Exercise 3.9.1. Use exercise 1.7.20(a).
Exercise 3.9.3(a). If 2k + 1 = n/m, take u = α^m and v = β^m in

\frac{u^{2k+1} + v^{2k+1}}{u + v} = (-uv)^k + \sum_{j=1}^{k} (-uv)^{k-j} (u^{2j} + v^{2j}),

so that y_n/y_m is a linear polynomial in the y_{2jm} with coefficients that are ± powers of b.
Exercise Use exercise B.3.7(c), and factor gA” — gB?. 
Exercise a). Write —~ as a polynomial in y and z. 
Exercise 3.9.10(a). √2 + √3 is a root of x^4 − 10x^2 + 1. Use Theorem

(b) √a + √b is a root of x^4 − 2(a + b)x^2 + (a − b)^2. Therefore the rational root
m = √a + √b must be an integer, and then m divides a − b. Writing a = b + mk we have
k = √a − √b so that b = ((m − k)/2)^2 and a = ((m + k)/2)^2.
Exercise 3.9.11(b). Prove that (√d + m)(√d − m) is an integer.
Exercise [3.9.15{b). Use Corollary 


Exercise 3.9.17(b). Write m = gM and n = gN where g = gcd(m, n) so that (M, N) = 1,
and then use exercise 3.7.7 (or exercise 3.9.16(b), for a less complete solution).


Exercise [3.10.2] Write the trinomial coefficient as the product of binomial coefficients. 


Exercise|3.11.1} Prove this by induction on n > 1, using the observation in the paragraph 
immediately above. 


EXERCISES IN CHAPTER 4 


Exercise [4.0.1] One can proceed by induction on the number of distinct prime factors of 
n, using the definition of multiplicative. 


Exercise Pair m with n — m, and then m with n/m. 
Exercise If the prime factors of n are p_1 < p_2 < ··· < p_k, then p_j ≥ k + j and so

\frac{\phi(n)}{n} = \prod_{j=1}^{k} \frac{p_j - 1}{p_j} \ge \prod_{j=1}^{k} \frac{k + j - 1}{k + j} = \frac{k}{2k} = \frac{1}{2}.
Exercise Let ℓ = (d, a) so that ℓ | a and therefore d/ℓ | (a/ℓ)b with (d/ℓ, a/ℓ) = 1 and
therefore m = d/ℓ | b.
Exercise b). What is the power of 2 in o(n)? 
Exercise Give a general lower bound on o(n). 

Exercise 4.2.5(a). If p^k || n, then 1 + 1/p ≤ σ(p^k)/p^k < 1 + 1/p + 1/p^2 + ··· = p/(p − 1).
(b) If n is a perfect number, then σ(n)/n = 2, and if it is odd with ≤ 2 prime factors,
then ∏_{p|n} p/(p − 1) ≤ (3/2)·(5/4), which is < 2, contradicting (a).
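As a numerical sanity check of the bound used in (b), the following Python sketch compares σ(n)/n with (3/2)·(5/4) = 15/8 for odd n having at most two distinct prime factors. The search range below 1000 and the helper names are arbitrary choices for illustration; the check supports the hint but is of course not a proof.

```python
from fractions import Fraction

def distinct_prime_factors(n):
    """Set of primes dividing n (trial division)."""
    primes, p = set(), 2
    while p * p <= n:
        if n % p == 0:
            primes.add(p)
            while n % p == 0:
                n //= p
        p += 1
    if n > 1:
        primes.add(n)
    return primes

def sigma(n):
    """Sum of the positive divisors of n."""
    return sum(d for d in range(1, n + 1) if n % d == 0)

bound = Fraction(3, 2) * Fraction(5, 4)      # product of p/(p-1) over p = 3, 5
ratios = [Fraction(sigma(n), n)
          for n in range(3, 1000, 2)          # odd n only
          if len(distinct_prime_factors(n)) <= 2]
largest_ratio = max(ratios)
```

Every ratio found stays below 15/8, and so below the value 2 required of a perfect number.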


Exercise [4.3.7(a). Use exercise[3.9.15[a). 


Exercise [4.3.1]/a). Prove this when a and b are both powers of a fixed prime and then
use multiplicativity.


Exercise [4.3.12] In both parts write, for each d\n, the integers m = an/d with (a,d) = 1. 
Use exercise [4.1.3] 

Exercise [4.3.13/a). You could use the second part of exercise [4.1.3 

Exercise [4.3.15{(b). Use multiplicativity. (e) Use exercise [4.2.5 

Exercise 4.5.1(a). Use the binomial theorem. (b) Let m = ∏_{p|n} p and x = −1 in (a).
Exercise [4.5.2] Expand the right-hand side. 


Exercise Let r = (a,m) and then s = a/r and t = m/r which therefore must be 
coprime. Now a = rs divides mn = rtn, so that s divides tn and therefore s divides n as 
(s,t) =1. Let u = n/s and we finally deduce b = mn/a = tu. 


— _P 
p-1° 




Exercise Use the expansion φ(n) = ∑_{d|n} μ(n/d) d from the proof of Theorem [4.]] in
section 4.4, and a similar expression for σ.


EXERCISES IN CHAPTER 5


Exercise Show that if 22" <a < 2?", then there are > n primes up to z. Then 
give a lower bound for n as a function of x. 

Exercise Show that if every prime factor of n is ≡ 0 or 1 (mod 3), then n ≡ 0 or 1
(mod 3).

Exercise |5.3.4| Consider splitting arithmetic progressions mod 3 into several arithmetic 
progressions mod 6. 

Exercise One might use exercise b) in this proof. 


Exercise 5.4.1(b). We wish to show that π(x + εx) > π(x). By (5.4.2) (and footnote 14) we
know that for any fixed δ > 0 we have (1 − δ) x/log x < π(x) < (1 + δ) x/log x if x is sufficiently
large. The result will then follow if the middle inequality holds in

\pi(x) < (1 + \delta) \frac{x}{\log x} < (1 - \delta) \frac{x + \epsilon x}{\log(x + \epsilon x)} < \pi(x + \epsilon x).

Now log(x + εx)/log x < 1 + ε/log x as log(1 + ε) < ε, and so the middle inequality follows if
1 + ε/log x < (1 − δ)(1 + ε)/(1 + δ). Selecting, say, δ = ε/3 this holds if x is sufficiently large.


Exercise |5.8.11} Use l’Hopital’s rule. 


Exercise 5.8.12. First prove that (Li(x) − x/log x) / (x/(log x)^2) → 1 as x → ∞.


Exercise [5.8.14{a). Use Corollary [2.3.1] 


Exercise [5.9.1] Either use Kummer’s Theorem (Theorem [3.7) or consider directly how 
often p divides the numerator and denominator of pig 


Exercise Show that, for each N ≥ 6, every integer in [7, 2N + 6] is
the sum of distinct primes in {2, 3, ..., 2N}, by induction on N.
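Before attempting the induction it can help to test the statement for small N. The Python sketch below does this by brute force; it reads the starting value as N = 6 (an assumption about the garbled bound), and the function names and the tested range up to N = 30 are illustrative choices.

```python
def primes_up_to(n):
    """Sieve of Eratosthenes: all primes <= n (for n >= 1)."""
    sieve = [True] * (n + 1)
    sieve[0:2] = [False, False]
    for i in range(2, int(n ** 0.5) + 1):
        if sieve[i]:
            for j in range(i * i, n + 1, i):
                sieve[j] = False
    return [i for i, is_prime in enumerate(sieve) if is_prime]

def sums_of_distinct_primes(prime_bound, max_total):
    """All totals <= max_total obtainable as sums of distinct primes <= prime_bound."""
    reachable = {0}
    for p in primes_up_to(prime_bound):
        # each prime is added at most once to every previously reachable total
        reachable |= {s + p for s in reachable if s + p <= max_total}
    return reachable

def claim_holds(N):
    """Is every integer in [7, 2N + 6] a sum of distinct primes <= 2N?"""
    sums = sums_of_distinct_primes(2 * N, 2 * N + 6)
    return all(t in sums for t in range(7, 2 * N + 7))

checked = [N for N in range(6, 31) if claim_holds(N)]
```

The check fails for N = 5, because 11 is not a sum of distinct primes from {2, 3, 5, 7}, which is consistent with starting at N = 6.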

Exercise [5.9.6] Let p be a prime in [2n, 4n]. Now construct all the pairs you can that sum 
to p. Proceed. 


Exercise [5.10.1] Maximize the log of the ratio using calculus. 


Exercise [5.10.2] Use Proposition [5.10.1] 

Exercise 5.10.3(a). If r < s/2, then by Bertrand's postulate there is a prime p ∈ (s/2, s] ⊂
(r, s]. Otherwise k = s − r ≤ r. In either case, by Bertrand's postulate or the Sylvester-Schur
Theorem, one term has a prime factor p > k, and so this is the only term that can
be divisible by p.


Exercise [5.11.8[b). Use the Fundamental Theorem of Algebra mod p (see Lagrange’s 
Theorem, Proposition|7.4.1). 


Exercise [5.11.9{a). Can be proved by induction on k. For k = 0 this is trivial. For larger 
k, let T C {1,2,...,m-—1} and we pair together the terms for S = T and S = TU {m} 
in our sum. The sum therefore becomes 


ys co” (on-+20+ 5221) - (+52) 


TET1;2;...%0—1} ger jeT 


e (‘) ey a 


i=0 TC{1,2,...,m—1} g€T 


and the result follows by induction, asm—1>k—12>i%. 
(b) Let x_0 = log n and if n has prime factors p_1, ..., p_m, then let x_j = −log p_j for
each j ≥ 1.

(c) We get kla1...2% in (a) and so (-1)"! TI in logp in (b). We prove this by 
induction using the proof in (a), since in the induction step only the 1 = k — 1 term 
remains, which is the result from the previous step multiplied by kx. 

EXERCISES IN CHAPTER 6 


Exercise[6.1.2] Study where lines of rational slope, going through the point (2,1), hit the 
curve again. 
Exercise Write down an equation that identifies when three given squares are in 
arithmetic progression. 
Exercise a). By (6.1.1) the area is g^2 rs(r^2 − s^2) where r > s ≥ 1 and (r, s) = 1. If
this is a square, then each of r, s, and r^2 − s^2 must be squares; call them x^2, y^2, and z^2,
respectively, so that x^4 − y^4 = z^2, which contradicts Theorem 6.2.

(c) Consider a right-angled triangle with sides x^2, 2y^2, z.
Exercise 6.5.3. Here b is the hypotenuse, and c is the area. Further hint: We need b^2 − 4c
and b^2 + 4c to be integer squares, say, u^2 and v^2, so that 4c = b^2 − u^2 = v^2 − b^2. Therefore
2b^2 = u^2 + v^2, so u, v have the same parity and therefore ((u + v)/2)^2 + ((v − u)/2)^2 = b^2. This is
our Pythagorean triangle, which has area (1/2) · ((u + v)/2) · ((v − u)/2) = (v^2 − u^2)/8 = c.


Exercise Let α = p/q with (p, q) = 1 so that α = (aα + b)/α = (ap + bq)/p. Now
(p, q) = 1 so comparing denominators we must have q = 1, and p divides ap + bq, so that
p divides bq, and therefore b.


Exercise 6.5.7. By (6.1.1) the perimeter of such a triangle has length 2grs + g(r^2 − s^2) +
g(r^2 + s^2) = 2gr(r + s) where r > s > 0. Therefore n has divisors r and r + s, where
r < r + s < 2r. On the other hand if n has divisors d_1, d_2 for which d_1 < d_2 < 2d_1, then
we may assume they are coprime, by dividing through by any common factor. Therefore
d_1 d_2 divides n and so we can let r = d_1, s = d_2 − d_1, and g = n/d_1 d_2.
Exercise 6.5.9. Prove that if n ≥ 13, then (n + 1)^2 + 128 < 2n^2. Then proceed by induction
on n for m ∈ [n^2 + 129, 2n^2].
Exercise [6.5.10} What values can cubes take mod 9? 
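The question in this hint can be answered by direct computation. A minimal Python sketch follows; the second computation, on sums of three cubes, is only an assumption about where the hint might lead and is not stated in the text.

```python
# Residues of cubes mod 9: it suffices to cube one representative of each class.
cube_residues = sorted({pow(x, 3, 9) for x in range(9)})

# Assumed follow-up (not stated in the hint): residues mod 9 of sums of three cubes.
three_cube_sums = sorted({(a + b + c) % 9
                          for a in cube_residues
                          for b in cube_residues
                          for c in cube_residues})
```

Cubes are therefore ≡ 0 or ±1 (mod 9), which restricts which residues can arise from short sums of cubes.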

EXERCISES IN CHAPTER 7 
Exercise b). Use the technique in the proof of Lemma 7.1.1.
Exercise 7.2.2. Let k := ord_m(a) and A = {1, a, a^2, ..., a^{k−1} (mod m)}. Show that if b
and b′ are any two reduced residues mod m, then either bA and b′A are disjoint or are
equal. Therefore the sets of the form bA, where b is a reduced residue mod m, which are
each of size k, partition the φ(m) reduced residues mod m. This implies that k divides
φ(m) as desired.
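The partition into sets bA can be seen explicitly in a small case. The following Python sketch is illustrative only: the modulus 21, the base 4 and the function names are assumptions chosen for the example.

```python
from math import gcd

def reduced_residues(m):
    """Reduced residues mod m (for m >= 2)."""
    return [b for b in range(1, m) if gcd(b, m) == 1]

def order(a, m):
    """Multiplicative order of a mod m, assuming gcd(a, m) = 1."""
    k, x = 1, a % m
    while x != 1:
        x = x * a % m
        k += 1
    return k

def cosets(a, m):
    """The distinct sets bA, where A = {1, a, ..., a^(k-1)} and b is a reduced residue."""
    k = order(a, m)
    A = frozenset(pow(a, i, m) for i in range(k))
    return {frozenset(b * x % m for x in A) for b in reduced_residues(m)}

m, a = 21, 4
k = order(a, m)
parts = cosets(a, m)
```

Here the 12 reduced residues mod 21 split into 4 sets of size k = 3, so k divides φ(21) = 12.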


Exercise Let k := ord_q(2). We have 2^p ≡ 1 (mod q) and so k divides p by Lemma
Therefore k = 1 or p, but k ≠ 1 as 2^1 ≢ 1 (mod q).


Exercise [7.4.1la) If n is not of the form p or p^2, write n = ab with 1 < a < b. If n = p^2,
then n divides p · 2p.


Exercise [74ala). If Q = (p − 1)/2, then
(p − 1)!/Q! = (p − 1)(p − 2) ··· (p − Q) ≡ (−1)(−2) ··· (−Q) = (−1)^Q Q! (mod p).


Exercise 7.5.2(b). As (g^{(p−1)/2})^2 = g^{p−1} ≡ 1 (mod p), so g^{(p−1)/2} is a square root of 1 (mod p);
that is, g^{(p−1)/2} ≡ 1 or −1 (mod p). But g has order p − 1 and so g^{(p−1)/2} ≢ 1 (mod p).


Exercise [7.10.2] Use Proposition [7.4.1] 
Exercise 7.10.4. In every solution n, n − 1, n − 2 have prime factors 2, 3, p for some p > 3.
At most one of these integers is divisible by p. Show that the other two lead to a solution
to 2^a − 3^b = ±1 and use exercise 7.10.3.


Exercise[7.10.5(b). Use Theorem 

(d) Make sure a is chosen so that (q,a— 1) =1. 
Exercise 7.10.6(a). The trick is to write x^p = ((x − y) + y)^p and then use the binomial
theorem. One can also write tn = ae and use exercise 2.5.20(a).
Exercise Take the j and p − j terms together.
Exercise 7.10.13. Let M = a_0 + 1 so that a_n = 2^n M − 1 for all n ≥ 0. Let p be an odd
prime dividing a_1. Then p divides a_p.
Exercise 7.10.16(b). Since n is not a Carmichael number, the subgroup in (a) is proper
and so contains at most half the reduced residues. (c) Let q = 2p − 1. Now n − 1 ≡ p − 1
(mod 2p − 2), so that if (a, n) = 1, then a^{n−1} ≡ a^{p−1} ≡ 1 (mod p) and a^{n−1} ≡ a^{p−1} =
a^{(q−1)/2} ≡ ±1 (mod q).
Exercise [7.10.1((a). M_p − 1 = 2^p − 2 is divisible by p.
Exercise 7.12.1(b). Let f(x_1, ..., x_p) = (x_2, ..., x_p, x_1) in part (a).


EXERCISES IN CHAPTER 8 
Exercise b). Use Lemma 8.1.1.


Exercise [8.1.3{a). Use that ( m ) (5) 


Exercise[8.1.0/a). The residues 1, g^2, g^4, ..., g^{p−3} (mod p) are evidently distinct and
non-zero squares. As there are (p − 1)/2 of them, they are all of the quadratic residues by Lemma
8.1.1.

(b) We see above that g = g^1 is not one of the quadratic residues.


(£1)? =1. 


Exercise There are two solutions to r? = a (mod p), say, r and —r (mod p), whose 
product is r-(—r) =—a (mod p). Note also that |S| = ?5%. 

Exercise [8.4.1] r is the largest integer with 2r —1 < S; that is, r < oth 
Exercise [8.4.5] Look at (2/p). 

Exercise [8.7.2{a). Use the Chinese Remainder Theorem and exercise[8.1.2{b). 


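Exercise If a is odd, then a = 1 + 2 · (a − 1)/2, and so

1 + 2 \cdot \frac{ab - 1}{2} = ab = \left(1 + 2 \cdot \frac{a - 1}{2}\right) \left(1 + 2 \cdot \frac{b - 1}{2}\right) \equiv 1 + 2 \cdot \left(\frac{a - 1}{2} + \frac{b - 1}{2}\right) \pmod{4}.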


Exercise 8.7.6. Select a^2 ≡ −2 (mod p) with a odd and minimal, so that 1 ≤ a ≤ p − 1.
Write a^2 + 2 = pr. Evidently pr = a^2 + 2 ≡ 3 (mod 8) and so r ≡ 3p ≡ 5 or 7 (mod 8).
But then a^2 ≡ −2 (mod r) and so (−2/r) = 1 with r = (a^2 + 2)/p < p. This contradicts the
induction hypothesis, and so (−2/p) = −1.


Exercise 8.8.1. Suppose that k > ℓ ≥ 1. If r is a quadratic residue mod p^k, then r is a
quadratic residue mod p^ℓ, trivially. On the other hand if r is a quadratic residue mod p^ℓ,
then it is a quadratic residue mod p^{ℓ+1} by Proposition 8.8.1, then mod p^{ℓ+2} by Proposition
etc., up to mod p^k. We take ℓ = 1 if p is odd, and ℓ = 3 if p = 2 and note that if r
is a quadratic residue mod 8, then r ≡ 1 (mod 8).


Exercise [8.9.5{a). Write n = 3°m where 3{m. 
Exercise 8.9.9(a). Consider the size of the set of residues {a^2 (mod p)} and of the set of
residues {m − b^2 (mod p)}, as a and b vary.

(b) Take m = -1. 

(c) Prove there is a solution u, v to au” +bu? = —c (mod p) and then multiply through 
by any z (mod p). 
Exercise [8.9.10[e). Apply Gauss’s trick as in the proof of Corollary [7.5.2] 
Exercise 8.9.12. For each solution to y^2 ≡ b (mod p), consider whether there are solutions
to x^2 ≡ y (mod p).
Exercise 8.9.14. Let b^2 ≡ −1 (mod p) and study (1 + b)^2 (mod p).
Exercise [8.9.15] Show that if a has order m (mod p), then oa,» consists of pot cycles of 
length m. 
Exercise [8.9.16/a). Use exercise[1.7.20{c). (b) Use exercise [L.7.20{b). 
Exercise[8.9.17] Select integer m with (m/n) = —1. Consider the prime divisors of integers 
of the form kn +m for well-chosen values of k. 
Exercise 8.9.18(a). Modify the ideas in Euclid's proof that there are infinitely many
primes. (b) n = −3. (c) Look at 4m^2 + 3 with m odd. (d) n = 3. Note (m^2 − 3)/2 ≡ 2
(mod 3). (e) n = −4. Note m^2 + 4 ≡ 5 (mod 8). (f) n = 2. Note m^2 − 2 ≡ 7 (mod 8).
(g) n = −2. Note m^2 + 2 ≡ 3 (mod 8). (h) n = −4 with (m, 6) = 1.
Exercise [8.9.24] Therefore (2) = (25) if n = 1 (mod 4), and (2) = (23) ifn=3 
(mod 4), and so the result follows by the induction hypothesis. 
Exercise 8.10.2. If N = pq + m where 0 ≤ m ≤ p − 1, then N − p[N/p] = N − pq = m. If
r ≥ 0, then m = r; and if r < 0, then m = p + r.


EXERCISES IN CHAPTER 9 
Exercise 1.2] If p does not divide a, then (b/a)^2 ≡ −1 (mod p). Therefore p = 2 or
p ≡ 1 (mod 4). We get the same conclusion if p does not divide b and, otherwise, p divides
(a, b).
Exercise By induction on k ≥ 1: It is trivial for k = 1 and otherwise let n_k = a^2 + b^2
and n_1 ··· n_{k−1} = c^2 + d^2 (by the induction hypothesis), and then the result follows from
(9.1.1).
Exercise 9.1.7(d). Use (a) to prove that |ac − bd|, |ad − bc| < p.
Exercise [9.3.1] Proceed as in the geometric proof of (6.1.1), or as in the proof of Propo- 
sition 9.1.2] 
Exercise b). Replace a and b by their absolutely least residues mod p. 
Exercise 9.7.3(b). Select any b with () = —1 in (a), and let m=r or s. 


Exercise We know that n is the length of the hypotenuse of a primitive Pythagorean
triple iff there exist coprime integers r, s of different parity with n = r^2 + s^2. Hence all of
n's prime factors are ≡ 1 (mod 4), and we know we get at least two representations of n
if it has at least two distinct prime factors.

Exercise Since m^2 + 2 are odd they must be ≡ 3 (mod 4), and so must be divisible
by a prime ≡ 3 (mod 4).

Exercise 9.7.10(a). In what domains do each of the ranges of φ lie? (b) We must be in the
middle case (as y, z ≠ 0) so that x = y in which case x(x + 4z) = p. Since p can only be
factored in one way into positive integers, we have x = 1, z = (p − 1)/4, that is, v = (1, 1, (p − 1)/4).
(c) Pair up the elements of S using φ.

Exercise [9.9.2] Try a=b=n=1. 




EXERCISES IN CHAPTER 10


Exercise 10.3.2. Hopefully n = pq and φ(n) = de − 1 = 29 × 197 − 1 = 5712; if so, then
p + q = n + 1 − φ(n) = 180. Therefore (x − p)(x − q) = x^2 − 180x + 5891 which we factor
to obtain p and q.
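The factoring step can be carried out mechanically with the quadratic formula. A minimal Python sketch of that computation follows; the function name factor_from_phi is a hypothetical helper, and the values n = 5891 and φ(n) = 29 × 197 − 1 are those appearing in the hint.

```python
from math import isqrt

def factor_from_phi(n, phi):
    """Recover (p, q) with p <= q from n = p*q and phi = (p-1)*(q-1), else None."""
    s = n + 1 - phi                  # s = p + q
    disc = s * s - 4 * n             # discriminant of x^2 - s*x + n
    if disc < 0:
        return None
    root = isqrt(disc)
    if root * root != disc or (s - root) % 2 != 0:
        return None                  # no integer roots, so the guess for phi was wrong
    p, q = (s - root) // 2, (s + root) // 2
    return (p, q) if p * q == n else None

n = 5891
phi = 29 * 197 - 1                   # the hoped-for value of phi(n) from the hint
factors = factor_from_phi(n, phi)
```

If the hoped-for value of φ(n) were wrong, the discriminant would typically fail to be a perfect square and the function would return None.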


Exercise [10.4.2(b). Use Corollary [7.5.3} 

Exercise 10.7.5. Since n is a Carmichael number we know that it is squarefree and has
prime divisors p and q, by Lemma 7.6.1. If a^{(n−1)/2} ≡ −1 (mod n), then let b ≡ 1 (mod p)
and b ≡ a (mod q), and determine the value of b^{(n−1)/2} (mod pq).


Exercise 10.8.6(a). Factor 4x^4 + 1 and substitute in x = 2^n.


EXERCISES IN CHAPTER 11


Exercise 11.2.1. If y = 0, then m_1 n = n_1 m. Now (m, n) = (m_1, n_1) = 1 and so m_1 = m
and n_1 = n contradicting our construction of the pair m, n.


Exercise [1.2.5] Consecutive powerful numbers of the form 27a? followed by b?, for some 
integers a and b. 


Exercise [11.4.2] Use the product rule to compute the derivative. 
Exercise[LL.6.3] Given a smallest solution to x? — dy? = 1 expand (a+ Vdy)®™ (mod d). 
Exercise [11.6.1i{c). Consider the example 1 + (2” — 1) = 2” with m > 2/e. 


EXERCISES IN CHAPTER 12 


Exercise 12.1.3. Suppose that d is a fundamental discriminant and [a, b, c] is an imprimitive
form of discriminant d. If h | (a, b, c), then h^2 | d, so that h = 2. But then D = d/h^2 ≡ 0 or
1 (mod 4), a contradiction. Now suppose that d is not a fundamental discriminant. Then
there exists a prime p such that d = p^2 D, where D ≡ 0 or 1 (mod 4). There is always a
form g of discriminant D and so pg is an imprimitive form of discriminant d.


Exercise [I2.1.4{c). Study the right-hand side of (12.1.2). 
Exercise [12.1.5| Take determinants of both sides. 


Exercise 12.1.6. First note that b ≡ d mod 2, and that if b = 2k + δ with δ the least residue
of d (mod 2), then the change of variable x → x − ky shows that [1, b, c] ∼ [1, δ, A], the
principal form. The value of A must be (δ − d)/4, so that the discriminant is d = b^2 − 4c.


Exercise 12.4.1. One example is d = −171. We begin by noting that |b| ≤ a ≤ √(171/3) =
√57 < 8 and b is odd. If b = ±1, then ac = (1 + 171)/4 = 43 with a ≤ c so that a = 1. If
b = ±3, then ac = (9 + 171)/4 = 45 with a ≤ c so that a = 1, 3, 5 and 1 < |b|. If b = ±5,
then ac = (25 + 171)/4 = 49 with a ≤ c so that a = 1, 7 and 1 < |b|. If b = ±7, then
ac = (49 + 171)/4 = 55 with a ≤ c so that a = 1, 5 which are both < |b|, so we are left
with [1, 1, 43], [3, 3, 15], [5, 3, 9], [5, −3, 9], [7, 5, 7], and [3, 3, 15] which is imprimitive.
Exercise [12.4.2] These are the smallest negative fundamental discriminants of class num- 
bers 1 to 8: 

For d = −3 we have [1, 1, 1]. For d = −15 we have [1, 1, 4], [2, 1, 2].

For d = −23 we have [1, 1, 6], [2, ±1, 3].

For d = −39 we have [1, 1, 10], [2, ±1, 5], [3, 3, 4].

For d = −47 we have [1, 1, 12], [2, ±1, 6], [3, ±1, 4].

For d = −87 we have [1, 1, 22], [2, ±1, 11], [3, 3, 8], [4, ±3, 6].

For d = −71 we have [1, 1, 18], [2, ±1, 9], [3, ±1, 6], [4, ±3, 5].

For d = −95 we have [1, 1, 24], [2, ±1, 12], [3, ±1, 8], [4, ±1, 6], [5, 5, 6].

Exercise 12.4.3. These are the smallest even negative fundamental discriminants of class
numbers 1 to 6: For d = −4 we have [1, 0, 1]; for d = −20 we have [1, 0, 5], [2, 2, 3]; for
d = −56 we have [1, 0, 14], [2, 0, 7], [3, ±2, 5]; for d = −104 we have [1, 0, 26], [2, 0, 13],
[3, ±2, 9], [5, ±4, 6].
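Lists of reduced forms such as these can be regenerated by a direct search. The Python sketch below assumes the usual reduction conditions for a negative discriminant d, namely −a < b ≤ a ≤ c with b ≥ 0 when a = c; the function name reduced_forms is an illustrative choice, not notation from the book.

```python
from math import gcd

def reduced_forms(d):
    """Reduced forms (a, b, c) with b*b - 4*a*c == d, for a discriminant d < 0."""
    assert d < 0 and d % 4 in (0, 1)
    forms = []
    a = 1
    while 3 * a * a <= -d:                    # a reduced form has a <= sqrt(|d|/3)
        for b in range(-a + 1, a + 1):        # -a < b <= a
            numerator = b * b - d
            if numerator % (4 * a) == 0:
                c = numerator // (4 * a)
                if c > a or (c == a and b >= 0):
                    forms.append((a, b, c))
        a += 1
    return forms

def primitive(form):
    """A form is primitive when its three coefficients have no common factor."""
    a, b, c = form
    return gcd(gcd(a, b), c) == 1

forms_171 = reduced_forms(-171)
class_number_95 = len([f for f in reduced_forms(-95) if primitive(f)])
```

For d = −171 the search returns five reduced forms, of which (3, 3, 15) is imprimitive, and for d = −95 it returns eight primitive forms, in agreement with the lists above.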
Exercise [12.9.3} Use Rabinowicz’s criterion, and quadratic reciprocity. 


Exercise 12.6.1. Prove and use the inequality am^2 + bmn + cn^2 ≥ am^2 − |b| max{|m|, |n|}^2 +
cn^2.


Exercise [I2.6.2[b). Use the smallest values properly represented by each form. 

Exercise [12.6.5{c). Use exercise [12.6.2{e). 

Exercise 12.6.7(c). Given a solution B, let C = (B^2 − d)/4A and then [A, B, C] represents
A properly (by (1, 0)). Find reduced f ∼ [A, B, C] and use the transformation matrix to
find the representation as in (b).

Exercise [2.8.1] Prove this one prime factor of A at a time and then use the Chinese 
Remainder Theorem. For each prime p, try f(1,0), f(0,1), and then f(1,1). 

Exercise 12.8.2. If f = [a, r, u], then the transformation x → x + ky, y → y yields that
f ∼ [a, b, c] where b = r + 2ka; that is, we can take b to be any value ≡ r (mod 2a).
Similarly if F = [A, s, v], then we can take b to be any value ≡ s (mod 2A). Such a b
exists by the Chinese Remainder Theorem provided r ≡ s (mod 2), and r and s have the
same parity as the discriminants of f and F.